28 Commits

Author SHA1 Message Date
Daniel Povey
d398f0ed70 Decrease random_prob from 0.5 to 0.333 2022-09-29 13:55:33 +08:00
Daniel Povey
461ad3655a Implement AttentionCombine as replacement for RandomCombine 2022-09-29 13:44:03 +08:00
Daniel Povey
e5a0d8929b Remove unused out_balancer member 2022-09-27 13:10:59 +08:00
Daniel Povey
6b12f20995 Remove out_balancer and out_norm from conv modules 2022-09-27 12:25:11 +08:00
Daniel Povey
71b3756ada Use half the dim per head, in self_attn layers. 2022-09-24 15:40:44 +08:00
Daniel Povey
ce3f59d9c7 Use dropout in attention, on attn weights. 2022-09-22 19:18:50 +08:00
Daniel Povey
24aea947d2 Fix issues where grad is None, and unused-grad cases 2022-09-22 19:18:16 +08:00
Daniel Povey
c16f795962 Avoid error in ddp by using last module's scores 2022-09-22 18:52:16 +08:00
Daniel Povey
0f85a3c2e5 Implement persistent attention scores 2022-09-22 18:47:16 +08:00
Daniel Povey
1d20c12bc0 Increase max_var_per_eig to 0.2 2022-09-22 12:28:35 +08:00
Daniel Povey
6eb9a0bc9b Halve max_var_per_eig to 0.05 2022-09-20 14:39:17 +08:00
Daniel Povey
cd5ac76a05 Add max-var-per-eig in encoder layers 2022-09-20 14:22:07 +08:00
Daniel Povey
3d72a65de8 Implement max-eig-proportion. 2022-09-19 10:26:37 +08:00
Daniel Povey
0f567e27a5 Add max_var_per_eig in self-attn 2022-09-18 21:22:01 +08:00
Daniel Povey
76031a7c1d Loosen some limits of activation balancers 2022-09-18 13:59:44 +08:00
Daniel Povey
3122637266 Use ScaledLinear where I previously had StructuredLinear 2022-09-17 13:18:58 +08:00
Daniel Povey
1a184596b6 A little code refactoring 2022-09-16 20:56:21 +08:00
Daniel Povey
e1182da6ac Restoring min_abs and max_abs defaults for the linear_pos proj. 2022-07-31 05:07:50 +08:00
Daniel Povey
3857a87b47 Merge branch 'merge_refactor_param_cov_norank1_iter_batch_max4.0_pow0.5_fix2r_lrupdate200_2k_ns' into merge2_refactor_max4.0_pow0.5_200_1k_ma3.0 2022-07-17 15:32:43 +08:00
Daniel Povey
f36ebad618 Remove 2/3 StructuredLinear/StructuredConv1d modules, use linear/conv1d 2022-07-17 06:40:19 +08:00
Daniel Povey
de1fd91435 Adding max_abs=3.0 to ActivationBalancer modules inside feedforward modules. 2022-07-16 07:19:26 +08:00
Daniel Povey
7f0756e156 Implement structured version of conformer 2022-06-17 15:10:21 +08:00
Daniel Povey
7338c60296 Remove Decorrelate() 2022-06-13 16:07:15 +08:00
Daniel Povey
d301f8ac6c Merge Decorrelate work, and simplification to RandomCombine, into pruned_transducer_stateless7 2022-06-11 11:07:07 +08:00
Daniel Povey
bc5c782294 Limit magnitude of linear_pos 2022-06-01 10:40:54 +08:00
Daniel Povey
61619c031e Add activation balancer to stop activations in self_attn from getting too large 2022-06-01 00:40:45 +08:00
Daniel Povey
1651fe0d42 Merge changes from pruned_transducer_stateless4->5 2022-05-31 13:00:11 +08:00
Daniel Povey
741dcd1d6d Move pruned_transducer_stateless4 to pruned_transducer_stateless7 2022-05-31 12:45:28 +08:00