47 Commits

Author SHA1 Message Date
Daniel Povey
9672dffac2 Merge branch 'scaled_adam_exp168' into scaled_adam_exp169 2022-10-22 14:05:07 +08:00
Daniel Povey
bdbd2cfce6 Penalize too large weights in softmax of AttentionDownsample() 2022-10-21 20:12:36 +08:00
Daniel Povey
476fb9e9f3 Reduce min_prob of ActivationBalancer from 0.1 to 0.05. 2022-10-21 15:42:04 +08:00
Daniel Povey
6e6209419c Merge branch 'scaled_adam_exp150' into scaled_adam_exp155 (conflicts: egs/librispeech/ASR/pruned_transducer_stateless7/conformer.py) 2022-10-20 15:04:27 +08:00
Daniel Povey
4565d43d5c Add hard limit of attention weights to ±50 2022-10-20 14:28:22 +08:00
Daniel Povey
6601035db1 Reduce min_abs from 1.0e-04 to 5.0e-06 2022-10-20 13:53:10 +08:00
Daniel Povey
5a0914fdcf Merge branch 'scaled_adam_exp149' into scaled_adam_exp150 2022-10-20 13:31:22 +08:00
Daniel Povey
679ba2ee5e Remove debug print 2022-10-20 13:30:55 +08:00
Daniel Povey
610281eaa2 Keep just the RandomGrad changes, vs. 149. Git history may not reflect real changes. 2022-10-20 13:28:50 +08:00
Daniel Povey
d137118484 Get the randomized backprop for softmax in autocast mode working. 2022-10-20 13:23:48 +08:00
Daniel Povey
d75d646dc4 Merge branch 'scaled_adam_exp147' into scaled_adam_exp149 2022-10-20 12:59:50 +08:00
Daniel Povey
f6b8f0f631 Fix bug in backprop of random_clamp() 2022-10-20 12:49:29 +08:00
Daniel Povey
f08a869769 Merge branch 'scaled_adam_exp151' into scaled_adam_exp150 2022-10-19 19:59:07 +08:00
Daniel Povey
cc15552510 Use full precision to do softmax and store ans. 2022-10-19 19:53:53 +08:00
Daniel Povey
a4443efa95 Add RandomGrad with min_abs=1.0e-04 2022-10-19 19:46:17 +08:00
Daniel Povey
0ad4462632 Reduce min_abs from 1e-03 to 1e-04 2022-10-19 19:27:28 +08:00
Daniel Povey
ef5a27388f Merge branch 'scaled_adam_exp146' into scaled_adam_exp149 2022-10-19 19:16:27 +08:00
Daniel Povey
9c54906e63 Implement randomized backprop for softmax. 2022-10-19 19:16:03 +08:00
Daniel Povey
f4442de1c4 Add reflect=0.1 to invocations of random_clamp() 2022-10-19 12:34:26 +08:00
Daniel Povey
c3c655d0bd Randomly clip attention scores to -5..5. 2022-10-19 11:59:24 +08:00
Daniel Povey
6b3f9e5036 Changes to avoid bug in backward hooks, affecting diagnostics. 2022-10-19 11:06:17 +08:00
Daniel Povey
1135669e93 Bug fix RE float16 2022-10-16 10:58:22 +08:00
Daniel Povey
fc728f2738 Reorganize Whiten() code; configs are not the same as before. Also remove MaxEig for self_attn module 2022-10-15 23:20:18 +08:00
Daniel Povey
96023419da Reworking of ActivationBalancer code to hopefully balance speed and effectiveness. 2022-10-14 19:20:32 +08:00
Daniel Povey
5f375be159 Merge branch 'scaled_adam_exp103b2' into scaled_adam_exp103b4 2022-10-14 15:27:10 +08:00
Daniel Povey
15b91c12d6 Reduce stats period from 10 to 4. 2022-10-14 15:14:06 +08:00
Daniel Povey
db8b9919da Reduce beta from 0.75 to 0.0. 2022-10-14 15:12:59 +08:00
Daniel Povey
23d6bf7765 Fix bug when channel_dim < 0 2022-10-13 13:52:28 +08:00
Daniel Povey
49c6b6943d Change scale_factor_scale from 0.5 to 0.8 2022-10-12 20:55:52 +08:00
Daniel Povey
b736bb4840 Cosmetic improvements 2022-10-12 19:34:48 +08:00
Daniel Povey
12323025d7 Make ActivationBalancer and MaxEig more efficient. 2022-10-12 18:44:52 +08:00
Daniel Povey
d7f6e8eb51 Only apply ActivationBalancer with prob 0.25. 2022-10-10 00:26:31 +08:00
Daniel Povey
00841f0f49 Remove unused code LearnedScale. 2022-10-09 16:07:31 +08:00
Daniel Povey
3e137dda5b Decrease frequency of logging variance_proportion 2022-10-09 12:05:52 +08:00
Daniel Povey
93dff29243 Introduce a scale dependent on the masking value 2022-10-03 14:34:37 +08:00
Daniel Povey
76e66408c5 Some cosmetic improvements 2022-09-27 11:08:44 +08:00
Daniel Povey
ceadfad48d Reduce debug freq 2022-09-22 12:30:49 +08:00
Daniel Povey
db1f4ccdd1 4x scale on max-eig constraint 2022-09-20 14:20:13 +08:00
Daniel Povey
3d72a65de8 Implement max-eig-proportion. 2022-09-19 10:26:37 +08:00
Daniel Povey
0f567e27a5 Add max_var_per_eig in self-attn 2022-09-18 21:22:01 +08:00
Daniel Povey
4a2b940321 Remove StructuredLinear, StructuredConv1d 2022-09-17 13:14:08 +08:00
Daniel Povey
1a184596b6 A little code refactoring 2022-09-16 20:56:21 +08:00
Daniel Povey
9d7af4be20 Modify scaling.py to prevent constant values 2022-07-29 09:34:13 +08:00
Daniel Povey
7f0756e156 Implement structured version of conformer 2022-06-17 15:10:21 +08:00
Daniel Povey
ca7cffcb42 Remove Decorrelate() class 2022-06-13 16:08:32 +08:00
Daniel Povey
d301f8ac6c Merge Decorrelate work, and simplification to RandomCombine, into pruned_transducer_stateless7 2022-06-11 11:07:07 +08:00
Daniel Povey
741dcd1d6d Move pruned_transducer_stateless4 to pruned_transducer_stateless7 2022-05-31 12:45:28 +08:00