Daniel Povey
|
0c3530a6fd
|
Merge branch 'scaled_adam_exp795' into scaled_adam_exp798
|
2022-12-30 14:29:48 +08:00 |
|
Daniel Povey
|
8056e0f9af
|
Make sure param_rms limit is effectively applied; fix tests in optim.py
|
2022-12-29 23:55:16 +08:00 |
|
Daniel Povey
|
e164393e91
|
Increase default grad_scale of Balancer from 0.02 to 0.04.
|
2022-12-29 21:38:19 +08:00 |
|
Daniel Povey
|
59be36181c
|
Replace ActivationBalancer with Balancer
|
2022-12-29 20:34:46 +08:00 |
|
Daniel Povey
|
c6bad1ee4f
|
Start ff modules with larger initial_scale
|
2022-12-29 18:50:12 +08:00 |
|
Daniel Povey
|
fbdb12cf77
|
Remove ZipformerEncoder.norm
|
2022-12-29 16:00:34 +08:00 |
|
Daniel Povey
|
0de1184c6d
|
Fix min_abs for AttentionSqueeze
|
2022-12-29 15:24:13 +08:00 |
|
Daniel Povey
|
a8282bb6d7
|
Adjust joiner and simple_lm/simple_am projections to account for larger activation dims
|
2022-12-29 12:52:11 +08:00 |
|
Daniel Povey
|
03e1f7dc01
|
Multiply min_abs values in line of encoder residuals by 4.
|
2022-12-29 12:49:04 +08:00 |
|
Daniel Povey
|
71d7843654
|
Re-introduce bias into BasicNorm and replace eps with log_scale.
|
2022-12-26 21:22:00 +08:00 |
|
Daniel Povey
|
920ed685ac
|
Change how bypass_scale works: src = src * bypass_scale + src_orig * (1.0 - bypass_scale)
|
2022-12-26 14:27:16 +08:00 |
|
Daniel Povey
|
11f5454b6a
|
Increase eps_max in norm_final from 3 to 4.
|
2022-12-26 13:47:23 +08:00 |
|
Daniel Povey
|
3d6ee443e3
|
Revert some recent changes that may not have been helpful.
|
2022-12-24 21:17:43 +08:00 |
|
Daniel Povey
|
43f2a8d50b
|
Make norm_final apply to delta, not src
|
2022-12-24 18:44:42 +08:00 |
|
Daniel Povey
|
2b50ce2247
|
Change eps range from -3..3 to -2..2
|
2022-12-24 00:09:39 +08:00 |
|
Daniel Povey
|
72420eef10
|
Change final layerdrop_rate from 0.0 to 0.015.
|
2022-12-23 21:56:13 +08:00 |
|
Daniel Povey
|
2e0f4de8ff
|
Apply limit on BasicNorm.eps more effectively using limit_param_value; add final norm to Zipformer.
|
2022-12-23 15:59:51 +08:00 |
|
Daniel Povey
|
049174722f
|
Change BasicNorm by adding 1+eps denominator; fix to (unused) DoubleSwish, revert to old status.
|
2022-12-23 13:16:51 +08:00 |
|
Daniel Povey
|
cff350d8de
|
Merge branch 'scaled_adam_exp760' into scaled_adam_exp765
|
2022-12-23 13:08:09 +08:00 |
|
Daniel Povey
|
edd6e0faf1
|
Add whitening on ConvNeXt module outputs; change grad_scale on whiten of Conv2dSubsampling.
|
2022-12-23 11:35:11 +08:00 |
|
Daniel Povey
|
ade7db54e3
|
Revert BasicNorm to its previous status, without the bias
|
2022-12-22 23:47:21 +08:00 |
|
Daniel Povey
|
b2125535fb
|
Remove mistaken factor of 4.0
|
2022-12-22 23:19:16 +08:00 |
|
Daniel Povey
|
49bf3ddc66
|
Add whitening module at end of Conv2dSubsampling layer
|
2022-12-22 23:14:30 +08:00 |
|
Daniel Povey
|
2e6610af5e
|
Fix diagnostics.py re backoff for eigs
|
2022-12-22 23:14:28 +08:00 |
|
Daniel Povey
|
e5b047a814
|
Merge branch 'scaled_adam_exp759' into scaled_adam_exp760
|
2022-12-22 17:38:17 +08:00 |
|
Daniel Povey
|
56fcb14e18
|
Merge branch 'scaled_adam_exp758' into scaled_adam_exp759
# Conflicts:
# egs/librispeech/ASR/pruned_transducer_stateless7/scaling.py
|
2022-12-22 17:37:22 +08:00 |
|
Daniel Povey
|
1dbe1e4086
|
Bug fix regarding bias
|
2022-12-22 17:35:29 +08:00 |
|
Daniel Povey
|
180c440e63
|
Make BasicNorm after convnext1 operate over all frequency bins.
|
2022-12-22 17:25:30 +08:00 |
|
Daniel Povey
|
dd7257f01b
|
Replace 1st ConvNorm2d with BasicNorm, remove the 2nd.
|
2022-12-22 16:50:52 +08:00 |
|
Daniel Povey
|
a0b2276f68
|
Subtract bias after scaling
|
2022-12-22 15:45:45 +08:00 |
|
Daniel Povey
|
d31e2e12c6
|
Change for memory efficiency
|
2022-12-22 15:20:58 +08:00 |
|
Daniel Povey
|
903955f5d9
|
Add bias to BasicNorm
|
2022-12-22 15:14:49 +08:00 |
|
Daniel Povey
|
b39cde85c8
|
Implement bias in BasicNorm
|
2022-12-22 14:59:29 +08:00 |
|
Daniel Povey
|
5aa874d8e3
|
Change layerdrop schedule of convnext, now ends at 0
|
2022-12-21 23:58:13 +08:00 |
|
Daniel Povey
|
678be7a2eb
|
Revert ConvNorm1d to BasicNorm in Conv2dSubsampling and ZipformerLayer
|
2022-12-21 23:53:13 +08:00 |
|
Daniel Povey
|
0995970f29
|
Decrease hidden_ratio of ConvNeXt from 4 to 3.
|
2022-12-21 18:43:11 +08:00 |
|
Daniel Povey
|
39e7c613c7
|
Add balancer to ConvNeXt
|
2022-12-21 18:41:05 +08:00 |
|
Daniel Povey
|
11f68afa1f
|
Revert "Remove memory-cutoff from ActivationBalancer."
This reverts commit 5afe0e78556e2e76750cae64008c9dd5c1931c5c.
|
2022-12-21 18:39:16 +08:00 |
|
Daniel Povey
|
829e4bd4db
|
Bug fix in save-bad-model code
|
2022-12-21 15:33:58 +08:00 |
|
Daniel Povey
|
788c4d97f1
|
Remove memory-cutoff from ActivationBalancer.
|
2022-12-21 15:09:26 +08:00 |
|
Daniel Povey
|
266e71cc79
|
Save checkpoint on failure.
|
2022-12-21 15:09:16 +08:00 |
|
Daniel Povey
|
96d167a2ec
|
Reduce floor on conv_min
|
2022-12-21 15:08:59 +08:00 |
|
Daniel Povey
|
05bcfd3b07
|
Make Whiten module update its prob every time
|
2022-12-21 12:56:37 +08:00 |
|
Daniel Povey
|
c097c13720
|
Change memory cutoff for ActivationBalancer; remove it for Whiten
|
2022-12-21 11:25:17 +08:00 |
|
Daniel Povey
|
4d61d39d36
|
Merge branch 'scaled_adam_exp747' into scaled_adam_exp748
|
2022-12-20 23:23:49 +08:00 |
|
Daniel Povey
|
3ef2a1d81e
|
Make some of the layer-skipping logic be done per sequence.
|
2022-12-20 22:26:30 +08:00 |
|
Daniel Povey
|
244633660d
|
Implement ConvNorm2d and use it in frontend after convnext
|
2022-12-20 20:28:03 +08:00 |
|
Daniel Povey
|
71880409cc
|
Bug fix; also make the final norm of Conv2dSubsampling a ConvNorm1d
|
2022-12-20 19:44:04 +08:00 |
|
Daniel Povey
|
3b4b33af58
|
Avoid infinities in padding frames
|
2022-12-20 19:19:45 +08:00 |
|
Daniel Povey
|
494139d27a
|
Replace BasicNorm of encoder layers with ConvNorm1d
|
2022-12-20 19:15:14 +08:00 |
|
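Note on commit 920ed685ac: the message gives the new bypass rule as src = src * bypass_scale + src_orig * (1.0 - bypass_scale). The following is a minimal sketch of that interpolation in plain Python, not the repository's code. The function name `apply_bypass` and the use of lists of floats with a scalar `bypass_scale` are assumptions made for illustration; the actual model presumably operates on tensors.

```python
from typing import List


def apply_bypass(src_orig: List[float], src: List[float], bypass_scale: float) -> List[float]:
    """Blend a layer's input (src_orig) with its output (src).

    Hypothetical helper illustrating the formula from the commit message:
        src = src * bypass_scale + src_orig * (1.0 - bypass_scale)
    bypass_scale = 1.0 keeps the layer output; 0.0 bypasses the layer entirely.
    """
    if len(src_orig) != len(src):
        raise ValueError("src_orig and src must have the same length")
    return [s * bypass_scale + o * (1.0 - bypass_scale) for o, s in zip(src_orig, src)]


if __name__ == "__main__":
    layer_input = [1.0, 2.0]
    layer_output = [3.0, 6.0]
    # Halfway between input and output.
    print(apply_bypass(layer_input, layer_output, 0.5))
```

Under this rule the two weights always sum to 1, so the result stays on the line segment between the layer's input and its output as long as bypass_scale lies in [0, 1].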