Commit history (all commits by Daniel Povey):

f4ff6188d9  2022-12-11 19:29:35 +08:00  Set max_abs values on Conv2dSubsampling module.
a01fc3b220  2022-12-11 19:12:03 +08:00  Change AttentionSqueeze dim from 128 to 256.
05c7cb5c83  2022-12-11 18:51:01 +08:00  Reduce attention_squeeze dim from 512 to 128.
634f1a4b82  2022-12-11 17:20:52 +08:00  Hardcode AttentionSqueeze dim at 512.
2d0fe7637c  2022-12-11 17:20:26 +08:00  Memory fix in WithLoss.
0edaf4d25c  2022-12-10 19:39:02 +08:00  Merge branch 'scaled_adam_exp667' into scaled_adam_exp671.
d7dd3f6dac  2022-12-10 18:04:21 +08:00  Merge branch 'scaled_adam_exp662' into scaled_adam_exp670.
cb12014c31  2022-12-10 16:09:51 +08:00  Implement dropout for scores in AttentionDownsample.
2f617fec43  2022-12-10 00:21:51 +08:00  Set nonlin_skip_rate to zero; make final min_abs value smaller in balancer2 of NonlinAttentionModule.
30c6e5b929  2022-12-10 00:08:38 +08:00  Make attention_squeeze use full dim.
0fc646f281  2022-12-10 00:07:37 +08:00  Merge branch 'scaled_adam_exp663' into scaled_adam_exp665.
d35eb7a3a6  2022-12-09 22:02:42 +08:00  Add cosmetic/diagnostics changes from scaled_adam_exp656.
958d9b929d  2022-12-09 21:00:24 +08:00  Double limit of penalize_abs_values_gt in AttentionDownsample from 10 to 20.
a00ed7e976  2022-12-09 20:05:50 +08:00  Decrease min_abs of NonlinAttentionModule from 0.75 to 0.5; make its max_abs (not active) a constant.
a92df3e850  2022-12-09 20:03:37 +08:00  Reduce final min_abs on conv_module from 1.0 to 0.75.
31f2f95f59  2022-12-09 20:01:19 +08:00  Reduce min_abs of ff module from 1.0 to 0.75.
5c0957d950  2022-12-09 18:11:27 +08:00  Fix memory issue in ActivationBalancer.
2ef0228db0  2022-12-09 17:59:00 +08:00  Make the ActivationBalancer relative to the mean, limited to -min_abs..max_abs.
912adfff7c  2022-12-08 21:11:58 +08:00  Increase all ff dims by 256.
75a1e05e49  2022-12-08 20:35:38 +08:00  Introduce nonlin_skip_rate.
1718b2de44  2022-12-08 20:35:02 +08:00  Merge branch 'scaled_adam_exp647' into scaled_adam_exp652.
            Conflicts: egs/librispeech/ASR/pruned_transducer_stateless7/zipformer.py
dea177fa63  2022-12-08 20:31:37 +08:00  Adjust min_positive and max_positive of NonlinAttentionModule.
e77a12602c  2022-12-08 20:11:28 +08:00  Merge branch 'scaled_adam_exp639' into scaled_adam_exp652.
d29e3d89e5  2022-12-08 20:07:35 +08:00  Use half the dimension in AttentionSqueeze.
6e598cb18d  2022-12-08 18:36:29 +08:00  Reduce top grad_scale limit from 128 to 32.
f4f3d057e7  2022-12-08 18:27:01 +08:00  Cosmetic improvements to convolution module; enable more stats.
6845da4351  2022-12-08 18:21:09 +08:00  Add stddev stats in diagnostics.py.
3f82ee0783  2022-12-08 18:18:46 +08:00  Merge dropout schedule, 0.3 ... 0.1 over 20k batches.
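The schedule merged in 3f82ee0783 ramps dropout from 0.3 down to 0.1 over the first 20k batches. A minimal sketch of such a piecewise-linear schedule, assuming a linear ramp that holds at the final value afterwards (the function name and signature are illustrative, not the actual icefall implementation):

```python
def dropout_schedule(batch_count: int,
                     start: float = 0.3,
                     end: float = 0.1,
                     ramp_batches: int = 20000) -> float:
    """Linearly interpolate the dropout probability from `start` to `end`
    over the first `ramp_batches` batches, then hold it at `end`."""
    if batch_count >= ramp_batches:
        return end
    frac = batch_count / ramp_batches
    return start + frac * (end - start)
```

Called once per batch, this yields 0.3 at batch 0, 0.2 at batch 10000, and 0.1 from batch 20000 onward.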
5f5d02ed0c  2022-12-07 18:07:56 +08:00  Add another whitening module, move balancer to output.
8859177bfa  2022-12-07 17:49:53 +08:00  Move balancer of NonlinAttentionModule to output; add an extra whitener at output.
51d8e64d66  2022-12-06 15:32:56 +08:00  Reduce min_abs on NonlinAttentionModule balancer2 from 0.5 to the default.
8f841e5b2b  2022-12-06 11:11:28 +08:00  Add another balancer for NonlinAttentionModule.
63e881f89b  2022-12-05 23:49:40 +08:00  Pass in dropout from train.py.
22617da725  2022-12-05 23:39:24 +08:00  Make dropout a schedule starting at 0.3.
0da228c587  2022-12-05 19:50:25 +08:00  Restore the computation of valid stats.
178eca1c0e  2022-12-05 17:53:23 +08:00  Revert scaling; scale only the grad.
b93cf0676a  2022-12-05 17:31:56 +08:00  Initialize Conv2dSubsampling with scale.
7999dd0dbe  2022-12-05 16:15:20 +08:00  Introduce scalar multiplication and change rules for updating gradient scale.
12fb2081b1  2022-12-04 21:22:06 +08:00  Fix deriv code.
c57eaf7979  2022-12-04 21:15:49 +08:00  Change x coeff from -0.1 to -0.08, as in 608.
7b1f093077  2022-12-04 19:18:16 +08:00  Use Swoosh-R in the Conv and Swoosh-L in the feedforward.
d214e1c352  2022-12-04 16:15:38 +08:00  Change min_abs,max_abs from 2,10 to 1,5 for FF module.
67812276ed  2022-12-03 15:10:03 +08:00  Change Swoosh formula so left crossing is near zero; change min_positive, max_positive of ActivationBalancer.
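The Swoosh activations developed in these commits were later documented in the Zipformer paper as SwooshR(x) = log(1 + exp(x - 1)) - 0.08x - 0.313261687 and SwooshL(x) = log(1 + exp(x - 4)) - 0.08x - 0.035; the -0.08 x-coefficient matches the "x coeff" change in c57eaf7979. A sketch assuming those published definitions (not the fused, memory-efficient kernels icefall actually uses):

```python
import math

def swoosh_r(x: float) -> float:
    """Swoosh-R: log(1 + exp(x - 1)) - 0.08*x - 0.313261687.
    The constant is chosen so the function passes through zero at x = 0."""
    return math.log1p(math.exp(x - 1.0)) - 0.08 * x - 0.313261687

def swoosh_l(x: float) -> float:
    """Swoosh-L: log(1 + exp(x - 4)) - 0.08*x - 0.035.
    The offset keeps the left zero crossing near the origin."""
    return math.log1p(math.exp(x - 4.0)) - 0.08 * x - 0.035
```

Per 7b1f093077, Swoosh-R is used in the convolution module and Swoosh-L in the feedforward modules. (For clarity only: `math.exp` overflows for very large x; a robust version would use a softplus with that case handled.)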
d5bfca4f49  2022-12-03 14:50:09 +08:00  Change activation of ConvolutionModule from Swoosh to Tanh, and min_abs from 1.0 to 0.75.
074f94f256  2022-12-03 14:47:36 +08:00  Increase ConvolutionModule min_abs from 0.4 to 1.0.
306fd85bab  2022-12-03 14:40:39 +08:00  Reduce initial whitening-schedule limit from 7.5 to 5.0 for NonlinAttentionModule.
b8e3091e04  2022-12-03 00:48:19 +08:00  Increase scale_gain_factor to 0.04.
183fc7a76d  2022-12-03 00:22:30 +08:00  Fix to diagnostics.py.
3faa4ffa3f  2022-12-03 00:19:48 +08:00  Revert "Change whitening limit of NonlinAttentionModule from _whitening_schedule(7.5), to 5.0".
            This reverts commit dcf6fced40a831892e7bb4e52eddf26e6e070eef.
bd1b1dd7e3  2022-12-03 00:13:09 +08:00  Simplify formula for Swoosh and make it pass through 0; make max_abs of ConvolutionModule a constant.