Daniel Povey | dea177fa63 | Adjust min_positive and max_positive of NonlinAttentionModule | 2022-12-08 20:31:37 +08:00
Daniel Povey | e77a12602c | Merge branch 'scaled_adam_exp639' into scaled_adam_exp652 | 2022-12-08 20:11:28 +08:00
Daniel Povey | d29e3d89e5 | Use half the dimension in AttentionSqueeze. | 2022-12-08 20:07:35 +08:00
Daniel Povey | f4f3d057e7 | Cosmetic improvements to convolution module; enable more stats. | 2022-12-08 18:27:01 +08:00
Daniel Povey | 3f82ee0783 | Merge dropout schedule, 0.3 ... 0.1 over 20k batches | 2022-12-08 18:18:46 +08:00
Daniel Povey | 51d8e64d66 | Reduce min_abs on NonlinAttentionModule balancer2 to default from 0.5. | 2022-12-06 15:32:56 +08:00
Daniel Povey | 8f841e5b2b | Add another balancer for NonlinAttentionModule. | 2022-12-06 11:11:28 +08:00
Daniel Povey | 22617da725 | Make dropout a schedule starting at 0.3. | 2022-12-05 23:39:24 +08:00
Daniel Povey | 178eca1c0e | Revert scaling, scale only grad. | 2022-12-05 17:53:23 +08:00
Daniel Povey | b93cf0676a | Initialize Conv2dSubsampling with scale. | 2022-12-05 17:31:56 +08:00
Daniel Povey | 7999dd0dbe | Introduce scalar multiplication and change rules for updating gradient scale. | 2022-12-05 16:15:20 +08:00
Daniel Povey | 7b1f093077 | Use Swoosh-R in the Conv and Swoosh-L in the feedforward. | 2022-12-04 19:18:16 +08:00
Daniel Povey | d214e1c352 | Change min_abs,max_abs from 2,10 to 1,5 for FF module | 2022-12-04 16:15:38 +08:00
Daniel Povey | 67812276ed | Change Swoosh formula so left crossing is near zero; change min_positive, max_positive of ActivationBalancer. | 2022-12-03 15:10:03 +08:00
Daniel Povey | d5bfca4f49 | Change activation of ConvolutionModule from Swoosh to Tanh, and min_abs from 1.0 to 0.75. | 2022-12-03 14:50:09 +08:00
Daniel Povey | 074f94f256 | Increase ConvolutionModule min_abs from 0.4 to 1.0 | 2022-12-03 14:47:36 +08:00
Daniel Povey | 306fd85bab | Reduce initial whitening-schedule limit from 7.5 to 5.0 for NonlinAttentionModule. | 2022-12-03 14:40:39 +08:00
Daniel Povey | 3faa4ffa3f | Revert "Change whitening limit of NonlinAttentionModule from _whitening_schedule(7.5), to 5.0" This reverts commit dcf6fced40a831892e7bb4e52eddf26e6e070eef. | 2022-12-03 00:19:48 +08:00
Daniel Povey | bd1b1dd7e3 | Simplify formula for Swoosh and make it pass through 0; make max_abs of ConvolutionModule a constant. | 2022-12-03 00:13:09 +08:00
Daniel Povey | 862e5828c5 | Set min_abs in balancer2 of ConvolutionModule to 0.4, and max_abs to 10.0 (this is a dontcare, really). | 2022-12-02 21:02:06 +08:00
Daniel Povey | 509467988f | Reduce min_abs in balancer2 of conv_module from 1.0 to 0.2. | 2022-12-02 20:29:23 +08:00
Daniel Povey | 14267a5194 | Use Swoosh not DoubleSwish in zipformer; fix constants in Swoosh | 2022-12-02 16:58:31 +08:00
Daniel Povey | 85bd9859e9 | Use different heads for nonlin/squeeze on alternate layers | 2022-12-02 13:17:45 +08:00
Daniel Povey | d8185201e9 | Merge branch 'scaled_adam_exp569' into scaled_adam_exp585 | 2022-12-01 19:14:26 +08:00
Daniel Povey | 4f9bb332fb | Use fewer hidden channels in NonlinAttentionModule | 2022-12-01 19:14:12 +08:00
Daniel Povey | dcf6fced40 | Change whitening limit of NonlinAttentionModule from _whitening_schedule(7.5), to 5.0 | 2022-12-01 14:28:56 +08:00
Daniel Povey | 025bcc155d | Change scale_min of Conv2dSubsampling from .01 to .1; some cosmetic changes/unimportant bugfixes. | 2022-12-01 14:20:15 +08:00
Daniel Povey | ba31272c92 | Change sigmoid to tanh in NonlinAttentionModule, and adjust abs limits of balancer to compensate. | 2022-11-30 21:44:45 +08:00
Daniel Povey | c75c2dc91d | reduce min_abs of zipformer-encoder-layer balancer from 0.4 to 0.25. | 2022-11-30 13:40:53 +08:00
Daniel Povey | d100aed58b | Revert "Reduce min of scale in Conv2dSubsampling from 0.01 to 0.2" This reverts commit 7589e3768975df10c3d022beb4c88f14c2f25d3d. | 2022-11-30 13:17:20 +08:00
Daniel Povey | 640d48262f | Double scale on aux_loss | 2022-11-29 16:20:21 +08:00
Daniel Povey | 7589e37689 | Reduce min of scale in Conv2dSubsampling from 0.01 to 0.2 | 2022-11-29 16:18:41 +08:00
Daniel Povey | 441fcf063d | Reduce final value of bypass_min from 0.25 to 0.2 | 2022-11-29 16:15:34 +08:00
Daniel Povey | 73e420865c | Revert min_abs in NonlinAttentionModule from 2.0 to 1.5 | 2022-11-29 15:53:29 +08:00
Daniel Povey | f48786534a | Merge branch 'scaled_adam_exp535' into scaled_adam_exp548 | 2022-11-29 15:43:44 +08:00
Daniel Povey | 28c5923986 | Remove max_factor=0.02 option in bottleneck balancer of class AttentionSqueeze, change its min,max positive to 0.2,0.8 | 2022-11-29 15:43:25 +08:00
Daniel Povey | 5632782ee1 | Merge branch 'scaled_adam_exp539' into scaled_adam_exp548 | 2022-11-29 15:40:23 +08:00
Daniel Povey | b90d8aabde | Revert the alternate-layers-only thing for nonlin_attention and attention_squeeze | 2022-11-29 15:38:55 +08:00
Daniel Povey | 753269668a | Change ratio in NonlinAttentionModule from 8 to 2 | 2022-11-29 15:38:13 +08:00
Daniel Povey | 93942725c4 | Increase min_abs of balancer of encoder layer from 0.2 to 0.4. | 2022-11-29 13:46:47 +08:00
Daniel Povey | 36a2f33a6f | Have value dim in NonlinAttentionModule be half of num_channels | 2022-11-28 21:55:06 +08:00
Daniel Povey | 258d4f1353 | Let ratio be 8, not 2, for sigmoid in NonlinAttentionModule | 2022-11-28 21:51:29 +08:00
Daniel Povey | 7018c722b5 | Let ratio of values to sigmoids be 8, not 2 | 2022-11-28 21:50:11 +08:00
Daniel Povey | 643c547eec | Double just the value dim in NonlinAttentionLayer. | 2022-11-28 20:56:47 +08:00
Daniel Povey | 88bc45d596 | Halve scale on aux_loss | 2022-11-28 16:37:46 +08:00
Daniel Povey | cee62c823d | have final prob of aux_loss for input projections be 0 | 2022-11-28 16:36:17 +08:00
Daniel Povey | 9cf5d92f39 | Have nonlin_attention and attention_squeeze operate only on every other layer. | 2022-11-28 16:24:24 +08:00
Daniel Povey | f483f1e0ef | Implement attention weights sharing for successive layers, for Zipformer | 2022-11-28 13:41:11 +08:00
Daniel Povey | 2a289d38b7 | Make max_abs for feedforward module be a constant at 15.0 | 2022-11-28 13:19:37 +08:00
Daniel Povey | 27a12a982b | Increase min_abs and max_abs in feedforward module. | 2022-11-28 12:52:28 +08:00