Daniel Povey
|
6601035db1
|
Reduce min_abs from 1.0e-04 to 5.0e-06
|
2022-10-20 13:53:10 +08:00 |
|
Daniel Povey
|
5a0914fdcf
|
Merge branch 'scaled_adam_exp149' into scaled_adam_exp150
|
2022-10-20 13:31:22 +08:00 |
|
Daniel Povey
|
679ba2ee5e
|
Remove debug print
|
2022-10-20 13:30:55 +08:00 |
|
Daniel Povey
|
610281eaa2
|
Keep just the RandomGrad changes, vs. 149. Git history may not reflect real changes.
|
2022-10-20 13:28:50 +08:00 |
|
Daniel Povey
|
d137118484
|
Get the randomized backprop for softmax in autocast mode working.
|
2022-10-20 13:23:48 +08:00 |
|
Daniel Povey
|
d75d646dc4
|
Merge branch 'scaled_adam_exp147' into scaled_adam_exp149
|
2022-10-20 12:59:50 +08:00 |
|
Daniel Povey
|
f6b8f0f631
|
Fix bug in backprop of random_clamp()
|
2022-10-20 12:49:29 +08:00 |
|
Daniel Povey
|
f08a869769
|
Merge branch 'scaled_adam_exp151' into scaled_adam_exp150
|
2022-10-19 19:59:07 +08:00 |
|
Daniel Povey
|
cc15552510
|
Use full precision to do softmax and store ans.
|
2022-10-19 19:53:53 +08:00 |
|
Daniel Povey
|
a4443efa95
|
Add RandomGrad with min_abs=1.0e-04
|
2022-10-19 19:46:17 +08:00 |
|
Daniel Povey
|
0ad4462632
|
Reduce min_abs from 1e-03 to 1e-04
|
2022-10-19 19:27:28 +08:00 |
|
Daniel Povey
|
ef5a27388f
|
Merge branch 'scaled_adam_exp146' into scaled_adam_exp149
|
2022-10-19 19:16:27 +08:00 |
|
Daniel Povey
|
9c54906e63
|
Implement randomized backprop for softmax.
|
2022-10-19 19:16:03 +08:00 |
|
Daniel Povey
|
f4442de1c4
|
Add reflect=0.1 to invocations of random_clamp()
|
2022-10-19 12:34:26 +08:00 |
|
Daniel Povey
|
c3c655d0bd
|
Random clip attention scores to -5..5.
|
2022-10-19 11:59:24 +08:00 |
|
Daniel Povey
|
6b3f9e5036
|
Changes to avoid bug in backward hooks, affecting diagnostics.
|
2022-10-19 11:06:17 +08:00 |
|
Daniel Povey
|
1135669e93
|
Bug fix RE float16
|
2022-10-16 10:58:22 +08:00 |
|
Daniel Povey
|
fc728f2738
|
Reorganize Whiten() code; configs are not the same as before. Also remove MaxEig for self_attn module
|
2022-10-15 23:20:18 +08:00 |
|
Daniel Povey
|
96023419da
|
Reworking of ActivationBalancer code to hopefully balance speed and effectiveness.
|
2022-10-14 19:20:32 +08:00 |
|
Daniel Povey
|
5f375be159
|
Merge branch 'scaled_adam_exp103b2' into scaled_adam_exp103b4
|
2022-10-14 15:27:10 +08:00 |
|
Daniel Povey
|
15b91c12d6
|
Reduce stats period from 10 to 4.
|
2022-10-14 15:14:06 +08:00 |
|
Daniel Povey
|
db8b9919da
|
Reduce beta from 0.75 to 0.0.
|
2022-10-14 15:12:59 +08:00 |
|
Daniel Povey
|
23d6bf7765
|
Fix bug when channel_dim < 0
|
2022-10-13 13:52:28 +08:00 |
|
Daniel Povey
|
49c6b6943d
|
Change scale_factor_scale from 0.5 to 0.8
|
2022-10-12 20:55:52 +08:00 |
|
Daniel Povey
|
b736bb4840
|
Cosmetic improvements
|
2022-10-12 19:34:48 +08:00 |
|
Daniel Povey
|
12323025d7
|
Make ActivationBalancer and MaxEig more efficient.
|
2022-10-12 18:44:52 +08:00 |
|
Daniel Povey
|
d7f6e8eb51
|
Only apply ActivationBalancer with prob 0.25.
|
2022-10-10 00:26:31 +08:00 |
|
Daniel Povey
|
00841f0f49
|
Remove unused code LearnedScale.
|
2022-10-09 16:07:31 +08:00 |
|
Daniel Povey
|
3e137dda5b
|
Decrease frequency of logging variance_proportion
|
2022-10-09 12:05:52 +08:00 |
|
Daniel Povey
|
93dff29243
|
Introduce a scale dependent on the masking value
|
2022-10-03 14:34:37 +08:00 |
|
Daniel Povey
|
76e66408c5
|
Some cosmetic improvements
|
2022-09-27 11:08:44 +08:00 |
|
Daniel Povey
|
ceadfad48d
|
Reduce debug freq
|
2022-09-22 12:30:49 +08:00 |
|
Daniel Povey
|
db1f4ccdd1
|
4x scale on max-eig constraint
|
2022-09-20 14:20:13 +08:00 |
|
Daniel Povey
|
3d72a65de8
|
Implement max-eig-proportion.
|
2022-09-19 10:26:37 +08:00 |
|
Daniel Povey
|
0f567e27a5
|
Add max_var_per_eig in self-attn
|
2022-09-18 21:22:01 +08:00 |
|
Daniel Povey
|
4a2b940321
|
Remove StructuredLinear, StructuredConv1d
|
2022-09-17 13:14:08 +08:00 |
|
Daniel Povey
|
1a184596b6
|
A little code refactoring
|
2022-09-16 20:56:21 +08:00 |
|
Daniel Povey
|
9d7af4be20
|
Modify scaling.py to prevent constant values
|
2022-07-29 09:34:13 +08:00 |
|
Daniel Povey
|
7f0756e156
|
Implement structured version of conformer
|
2022-06-17 15:10:21 +08:00 |
|
Daniel Povey
|
ca7cffcb42
|
Remove Decorrelate() class
|
2022-06-13 16:08:32 +08:00 |
|
Daniel Povey
|
d301f8ac6c
|
Merge Decorrelate work, and simplification to RandomCombine, into pruned_transducer_stateless7
|
2022-06-11 11:07:07 +08:00 |
|
Daniel Povey
|
741dcd1d6d
|
Move pruned_transducer_stateless4 to pruned_transducer_stateless7
|
2022-05-31 12:45:28 +08:00 |
|