142 Commits

Author SHA1 Message Date
Daniel Povey
903955f5d9 Add bias to BasicNorm 2022-12-22 15:14:49 +08:00
Daniel Povey
b39cde85c8 Implement bias in BasicNorm 2022-12-22 14:59:29 +08:00
Daniel Povey
11f68afa1f Revert "Remove memory-cutoff from ActivationBalancer." This reverts commit 5afe0e78556e2e76750cae64008c9dd5c1931c5c. 2022-12-21 18:39:16 +08:00
Daniel Povey
788c4d97f1 Remove memory-cutoff from ActivationBalancer. 2022-12-21 15:09:26 +08:00
Daniel Povey
96d167a2ec Reduce floor on conv_min 2022-12-21 15:08:59 +08:00
Daniel Povey
05bcfd3b07 Make Whiten module update its prob every time 2022-12-21 12:56:37 +08:00
Daniel Povey
c097c13720 Change memory cutoff for ActivationBalancer; remove it for Whiten 2022-12-21 11:25:17 +08:00
Daniel Povey
244633660d Implement ConvNorm2d and use it in frontend after convnext 2022-12-20 20:28:03 +08:00
Daniel Povey
71880409cc Bug fix; also make the final norm of Conv2dSubsampling a ConvNorm1d 2022-12-20 19:44:04 +08:00
Daniel Povey
3b4b33af58 Avoid infinities in padding frames 2022-12-20 19:19:45 +08:00
Daniel Povey
494139d27a Replace BasicNorm of encoder layers with ConvNorm1d 2022-12-20 19:15:14 +08:00
Daniel Povey
5e1bf8b8ec Add BasicNorm to ConvNeXt; increase prob given to CutoffEstimator; adjust default probs of ActivationBalancer. 2022-12-18 14:14:15 +08:00
Daniel Povey
dfeafd6aa8 Remove print statement in CutoffEstimator 2022-12-17 16:28:45 +08:00
Daniel Povey
29df07ba2c Add memory cutoff on ActivationBalancer and Whiten 2022-12-17 16:20:15 +08:00
Daniel Povey
744dca1c9b Merge branch 'scaled_adam_exp724' into scaled_adam_exp726 2022-12-17 15:46:57 +08:00
Daniel Povey
b9326e1ef2 Fix to print statement 2022-12-16 18:07:43 +08:00
Daniel Povey
8e6c7ef3e2 Adjust default prob of ActivationBalancer. 2022-12-16 15:08:46 +08:00
Daniel Povey
56ac7354df Remove LinearWithAuxLoss; simplify schedule of prob in ActivationBalancer. 2022-12-16 15:07:42 +08:00
Daniel Povey
083e5474c4 Reduce ConvNeXt parameters. 2022-12-16 00:21:04 +08:00
Daniel Povey
8d9301e225 Remove potentially wrong typing info 2022-12-15 23:47:41 +08:00
Daniel Povey
6caaa4e9c6 Bug fix in caching_eval, may make no difference. 2022-12-15 23:32:29 +08:00
Daniel Povey
f5d4fb092d Bug fix in caching_eval 2022-12-15 23:24:36 +08:00
Daniel Povey
d26ee2bf81 Try to implement caching evaluation for memory efficient training 2022-12-15 23:06:40 +08:00
Daniel Povey
f66c1600f4 Bug fix to printing code 2022-12-15 21:55:23 +08:00
Daniel Povey
2d0fe7637c Memory fix in WithLoss 2022-12-11 17:20:26 +08:00
Daniel Povey
0fc646f281 Merge branch 'scaled_adam_exp663' into scaled_adam_exp665 2022-12-10 00:07:37 +08:00
Daniel Povey
d35eb7a3a6 Add cosmetic/diagnostics changes from scaled_adam_exp656. 2022-12-09 22:02:42 +08:00
Daniel Povey
5c0957d950 Fix memory issue in ActivationBalancer 2022-12-09 18:11:27 +08:00
Daniel Povey
2ef0228db0 Make the ActivationBalancer relative to the mean, limited to -min_abs..max_abs 2022-12-09 17:59:00 +08:00
Daniel Povey
3f82ee0783 Merge dropout schedule, 0.3 ... 0.1 over 20k batches 2022-12-08 18:18:46 +08:00
Daniel Povey
22617da725 Make dropout a schedule starting at 0.3. 2022-12-05 23:39:24 +08:00
Daniel Povey
178eca1c0e Revert scaling, scale only grad. 2022-12-05 17:53:23 +08:00
Daniel Povey
b93cf0676a Initialize Conv2dSubsampling with scale. 2022-12-05 17:31:56 +08:00
Daniel Povey
12fb2081b1 Fix deriv code 2022-12-04 21:22:06 +08:00
Daniel Povey
c57eaf7979 Change x coeff from -0.1 to -0.08, as in 608. 2022-12-04 21:15:49 +08:00
Daniel Povey
7b1f093077 Use Swoosh-R in the Conv and Swoosh-L in the feedforward. 2022-12-04 19:18:16 +08:00
Daniel Povey
67812276ed Change Swoosh formula so left crossing is near zero; change min_positive, max_positive of ActivationBalancer. 2022-12-03 15:10:03 +08:00
Daniel Povey
b8e3091e04 Increase scale_gain_factor to 0.04. 2022-12-03 00:48:19 +08:00
Daniel Povey
bd1b1dd7e3 Simplify formula for Swoosh and make it pass through 0; make max_abs of ConvolutionModule a constant. 2022-12-03 00:13:09 +08:00
Daniel Povey
84f51ab1b1 Bug fix in scripting mode 2022-12-02 20:28:17 +08:00
Daniel Povey
9a2a58e20d Fix bug one versus zero 2022-12-02 19:12:18 +08:00
Daniel Povey
2bfc38207c Fix constants in SwooshFunction. 2022-12-02 18:37:23 +08:00
Daniel Povey
14267a5194 Use Swoosh not DoubleSwish in zipformer; fix constants in Swoosh 2022-12-02 16:58:31 +08:00
Daniel Povey
ec10573edc First version of swoosh 2022-12-02 16:34:53 +08:00
Daniel Povey
d260b54177 Subtract, not add, 0.025. 2022-12-02 15:55:48 +08:00
Daniel Povey
9a71406a46 Reduce offset from 0.075 to 0.025. 2022-12-02 15:40:21 +08:00
Daniel Povey
c71a3c6098 Change offset 2022-12-02 15:20:37 +08:00
Daniel Povey
f0f204552d Add -0.05 to DoubleSwish. 2022-12-02 15:17:41 +08:00
Daniel Povey
983a690c63 Change DoubleSwish formulation, add alpha*x only for x.abs() > 0.15. 2022-12-01 17:20:56 +08:00
Daniel Povey
d682ecc246 Introduce alpha for DoubleSwish, set it to -0.05. 2022-11-30 18:58:25 +08:00