1779 Commits

Author SHA1 Message Date
Daniel Povey
b93cf0676a Initialize Conv2dSubsampling with scale. 2022-12-05 17:31:56 +08:00
Daniel Povey
7999dd0dbe Introduce scalar multiplication and change rules for updating gradient scale. 2022-12-05 16:15:20 +08:00
Daniel Povey
12fb2081b1 Fix deriv code 2022-12-04 21:22:06 +08:00
Daniel Povey
c57eaf7979 Change x coeff from -0.1 to -0.08, as in 608. 2022-12-04 21:15:49 +08:00
Daniel Povey
7b1f093077 Use Swoosh-R in the Conv and Swoosh-L in the feedforward. 2022-12-04 19:18:16 +08:00
Daniel Povey
d214e1c352 Change min_abs,max_abs from 2,10 to 1,5 for FF module 2022-12-04 16:15:38 +08:00
Daniel Povey
67812276ed Change Swoosh formula so left crossing is near zero; change min_positive, max_positive of ActivationBalancer. 2022-12-03 15:10:03 +08:00
Daniel Povey
d5bfca4f49 Change activation of ConvolutionModule from Swoosh to Tanh, and min_abs from 1.0 to 0.75. 2022-12-03 14:50:09 +08:00
Daniel Povey
074f94f256 Increase ConvolutionModule min_abs from 0.4 to 1.0 2022-12-03 14:47:36 +08:00
Daniel Povey
306fd85bab Reduce initial whitening-schedule limit from 7.5 to 5.0 for NonlinAttentionModule. 2022-12-03 14:40:39 +08:00
Daniel Povey
b8e3091e04 Increase scale_gain_factor to 0.04. 2022-12-03 00:48:19 +08:00
Daniel Povey
3faa4ffa3f Revert "Change whitening limit of NonlinAttentionModule from _whitening_schedule(7.5), to 5.0"
This reverts commit dcf6fced40a831892e7bb4e52eddf26e6e070eef.
2022-12-03 00:19:48 +08:00
Daniel Povey
bd1b1dd7e3 Simplify formula for Swoosh and make it pass through 0; make max_abs of ConvolutionModule a constant. 2022-12-03 00:13:09 +08:00
Daniel Povey
862e5828c5 Set min_abs in balancer2 of ConvolutionModule to 0.4, and max_abs to 10.0 (this is a dontcare, really). 2022-12-02 21:02:06 +08:00
Daniel Povey
509467988f Reduce min_abs in balancer2 of conv_module from 1.0 to 0.2. 2022-12-02 20:29:23 +08:00
Daniel Povey
84f51ab1b1 Bug fix in scripting mode 2022-12-02 20:28:17 +08:00
Daniel Povey
9a2a58e20d Fix one-versus-zero bug 2022-12-02 19:12:18 +08:00
Daniel Povey
2bfc38207c Fix constants in SwooshFunction. 2022-12-02 18:37:23 +08:00
Daniel Povey
14267a5194 Use Swoosh not DoubleSwish in zipformer; fix constants in Swoosh 2022-12-02 16:58:31 +08:00
Daniel Povey
ec10573edc First version of swoosh 2022-12-02 16:34:53 +08:00
Daniel Povey
d260b54177 Subtract, not add, 0.025. 2022-12-02 15:55:48 +08:00
Daniel Povey
9a71406a46 Reduce offset from 0.075 to 0.025. 2022-12-02 15:40:21 +08:00
Daniel Povey
c71a3c6098 Change offset 2022-12-02 15:20:37 +08:00
Daniel Povey
f0f204552d Add -0.05 to DoubleSwish. 2022-12-02 15:17:41 +08:00
Daniel Povey
4afd95d822 Merge branch 'scaled_adam_exp583' into scaled_adam_exp592 2022-12-02 15:14:11 +08:00
Daniel Povey
85bd9859e9 Use different heads for nonlin/squeeze on alternate layers 2022-12-02 13:17:45 +08:00
Daniel Povey
d8185201e9 Merge branch 'scaled_adam_exp569' into scaled_adam_exp585 2022-12-01 19:14:26 +08:00
Daniel Povey
4f9bb332fb Use fewer hidden channels in NonlinAttentionModule 2022-12-01 19:14:12 +08:00
Daniel Povey
983a690c63 Change DoubleSwish formulation, add alpha*x only for x.abs() > 0.15. 2022-12-01 17:20:56 +08:00
Daniel Povey
d294449221 Fix typo 0.21->0.2 2022-12-01 15:29:46 +08:00
Daniel Povey
f0c46ce564 Double nonlin_skip_rate and have it last twice as long. 2022-12-01 15:28:44 +08:00
Daniel Povey
cac1a8b860 Merge branch 'scaled_adam_exp569' into scaled_adam_exp576 2022-12-01 15:20:20 +08:00
Daniel Povey
4621e924ba Introduce dropout schedule for NonlinAttentionModule 2022-12-01 15:19:51 +08:00
Daniel Povey
dcf6fced40 Change whitening limit of NonlinAttentionModule from _whitening_schedule(7.5), to 5.0 2022-12-01 14:28:56 +08:00
Daniel Povey
025bcc155d Change scale_min of Conv2dSubsampling from .01 to .1; some cosmetic changes/unimportant bugfixes. 2022-12-01 14:20:15 +08:00
Daniel Povey
ba31272c92 Change sigmoid to tanh in NonlinAttentionModule, and adjust abs limits of balancer to compensate. 2022-11-30 21:44:45 +08:00
Daniel Povey
d682ecc246 Introduce alpha for DoubleSwish, set it to -0.05. 2022-11-30 18:58:25 +08:00
Daniel Povey
c75c2dc91d Reduce min_abs of zipformer-encoder-layer balancer from 0.4 to 0.25. 2022-11-30 13:40:53 +08:00
Daniel Povey
d100aed58b Revert "Reduce min of scale in Conv2dSubsampling from 0.01 to 0.2"
This reverts commit 7589e3768975df10c3d022beb4c88f14c2f25d3d.
2022-11-30 13:17:20 +08:00
Daniel Povey
12e8c3f0fa One more layer on input 2022-11-29 16:47:24 +08:00
Daniel Povey
640d48262f Double scale on aux_loss 2022-11-29 16:20:21 +08:00
Daniel Povey
7589e37689 Reduce min of scale in Conv2dSubsampling from 0.01 to 0.2 2022-11-29 16:18:41 +08:00
Daniel Povey
441fcf063d Reduce final value of bypass_min from 0.25 to 0.2 2022-11-29 16:15:34 +08:00
Daniel Povey
73e420865c Revert min_abs in NonlinAttentionModule from 2.0 to 1.5 2022-11-29 15:53:29 +08:00
Daniel Povey
f48786534a Merge branch 'scaled_adam_exp535' into scaled_adam_exp548 2022-11-29 15:43:44 +08:00
Daniel Povey
28c5923986 Remove max_factor=0.02 option in bottleneck balancer of class AttentionSqueeze, change its min,max positive to 0.2,0.8 2022-11-29 15:43:25 +08:00
Daniel Povey
5632782ee1 Merge branch 'scaled_adam_exp539' into scaled_adam_exp548 2022-11-29 15:40:23 +08:00
Daniel Povey
b90d8aabde Revert the alternate-layers-only thing for nonlin_attention and attention_squeeze 2022-11-29 15:38:55 +08:00
Daniel Povey
753269668a Change ratio in NonlinAttentionModule from 8 to 2 2022-11-29 15:38:13 +08:00
Daniel Povey
93942725c4 Increase min_abs of balancer of encoder layer from 0.2 to 0.4. 2022-11-29 13:46:47 +08:00
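
For readers following the activation-function thread in the log above (from "First version of swoosh" through "Use Swoosh-R in the Conv and Swoosh-L in the feedforward"), here is a minimal sketch of the two Swoosh variants as plain PyTorch functions. The exact constants (the -0.08 x-coefficient, the x-4 and x-1 shifts, and the offsets that put the zero crossing near the origin) are taken from the Zipformer recipe as later published, not from these commits directly, and the helper names swoosh_l / swoosh_r are illustrative, not the repository's actual class names.

    # Sketch only: constants follow the published Zipformer recipe, not this log.
    import torch
    import torch.nn.functional as F

    def swoosh_l(x: torch.Tensor) -> torch.Tensor:
        # SwooshL(x) = log(1 + exp(x - 4)) - 0.08*x - 0.035
        # Left-side zero crossing lies near the origin; used in feedforward modules.
        return F.softplus(x - 4.0) - 0.08 * x - 0.035

    def swoosh_r(x: torch.Tensor) -> torch.Tensor:
        # SwooshR(x) = log(1 + exp(x - 1)) - 0.08*x - 0.313261687
        # The offset equals softplus(-1), so the curve passes through zero at x = 0;
        # used in the convolution module.
        return F.softplus(x - 1.0) - 0.08 * x - 0.313261687

    if __name__ == "__main__":
        x = torch.linspace(-6.0, 6.0, steps=7)
        print(swoosh_l(x))
        print(swoosh_r(x))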