icefall

mirror of https://github.com/k2-fsa/icefall.git synced 2025-12-11 06:55:27 +00:00

Author	SHA1	Message	Date
Daniel Povey	5fe8cb134f	Remove final combination; implement layer drop that drops the final layers.	2022-10-04 22:19:44 +08:00
Daniel Povey	006fcc18cd	Introduce offset in layerdrop_scaleS	2022-10-04 12:06:35 +08:00
Daniel Povey	33c24e4114	Bug fix	2022-10-03 23:07:30 +08:00
Daniel Povey	a9f950a1f7	Make the scaling factors more global and the randomness of dropout more random	2022-10-03 22:49:32 +08:00
Daniel Povey	88d0da7192	Simplify the learned scaling factor on the modules	2022-10-03 17:54:56 +08:00
Daniel Povey	b3af9f67ae	Implement efficient layer dropout	2022-10-03 17:19:16 +08:00
Daniel Povey	93dff29243	Introduce a scale dependent on the masking value	2022-10-03 14:34:37 +08:00
Daniel Povey	1be455438a	Decrease feature_mask_dropout_prob back from 0.2 to 0.15, i.e. revert the 43->48 change.	2022-10-02 14:00:36 +08:00
Daniel Povey	cf5f7e5dfd	Swap random_prob and single_prob, to reduce prob of being randomized.	2022-10-01 23:50:38 +08:00
Daniel Povey	8d517a69e4	Increase feature_mask_dropout_prob from 0.15 to 0.2.	2022-10-01 23:32:24 +08:00
Daniel Povey	e9326a7d16	Remove dropout from inside ConformerEncoderLayer, for adding to residuals	2022-10-01 13:13:10 +08:00
Daniel Povey	cc64f2f15c	Reduce feature_mask_dropout_prob from 0.25 to 0.15.	2022-10-01 12:24:07 +08:00
Daniel Povey	1eb603f4ad	Reduce single_prob from 0.5 to 0.25	2022-09-30 22:14:53 +08:00
Daniel Povey	ab7c940803	Include changes from Liyong about padding conformer module.	2022-09-30 18:37:31 +08:00
Daniel Povey	38f89053bd	Introduce feature mask per frame	2022-09-29 17:31:04 +08:00
Daniel Povey	056b9a4f9a	Apply single_prob mask, so sometimes we just get one layer as output.	2022-09-29 15:29:37 +08:00
Daniel Povey	d8f7310118	Add print statement	2022-09-29 14:15:29 +08:00
Daniel Povey	d398f0ed70	Decrease random_prob from 0.5 to 0.333	2022-09-29 13:55:33 +08:00
Daniel Povey	461ad3655a	Implement AttentionCombine as replacement for RandomCombine	2022-09-29 13:44:03 +08:00
Daniel Povey	e5a0d8929b	Remove unused out_balancer member	2022-09-27 13:10:59 +08:00
Daniel Povey	6b12f20995	Remove out_balancer and out_norm from conv modules	2022-09-27 12:25:11 +08:00
Daniel Povey	71b3756ada	Use half the dim per head, in self_attn layers.	2022-09-24 15:40:44 +08:00
Daniel Povey	ce3f59d9c7	Use dropout in attention, on attn weights.	2022-09-22 19:18:50 +08:00
Daniel Povey	24aea947d2	Fix issues where grad is None, and unused-grad cases	2022-09-22 19:18:16 +08:00
Daniel Povey	c16f795962	Avoid error in ddp by using last module's scores	2022-09-22 18:52:16 +08:00
Daniel Povey	0f85a3c2e5	Implement persistent attention scores	2022-09-22 18:47:16 +08:00
Daniel Povey	1d20c12bc0	Increase max_var_per_eig to 0.2	2022-09-22 12:28:35 +08:00
Daniel Povey	6eb9a0bc9b	Halve max_var_per_eig to 0.05	2022-09-20 14:39:17 +08:00
Daniel Povey	cd5ac76a05	Add max-var-per-eig in encoder layers	2022-09-20 14:22:07 +08:00
Daniel Povey	3d72a65de8	Implement max-eig-proportion..	2022-09-19 10:26:37 +08:00
Daniel Povey	0f567e27a5	Add max_var_per_eig in self-attn	2022-09-18 21:22:01 +08:00
Daniel Povey	76031a7c1d	Loosen some limits of activation balancers	2022-09-18 13:59:44 +08:00
Daniel Povey	3122637266	Use ScaledLinear where I previously had StructuredLinear	2022-09-17 13:18:58 +08:00
Daniel Povey	1a184596b6	A little code refactoring	2022-09-16 20:56:21 +08:00
Daniel Povey	e1182da6ac	Restoring min_abs and max_abs defaults for the linear_pos proj.	2022-07-31 05:07:50 +08:00
Daniel Povey	3857a87b47	Merge branch 'merge_refactor_param_cov_norank1_iter_batch_max4.0_pow0.5_fix2r_lrupdate200_2k_ns' into merge2_refactor_max4.0_pow0.5_200_1k_ma3.0	2022-07-17 15:32:43 +08:00
Daniel Povey	f36ebad618	Remove 2/3 StructuredLinear/StructuredConv1d modules, use linear/conv1d	2022-07-17 06:40:19 +08:00
Daniel Povey	de1fd91435	Adding max_abs=3.0 to ActivationBalancer modules inside feedforward modules.	2022-07-16 07:19:26 +08:00
Daniel Povey	7f0756e156	Implement structured version of conformer	2022-06-17 15:10:21 +08:00
Daniel Povey	7338c60296	Remove Decorrelate()	2022-06-13 16:07:15 +08:00
Daniel Povey	d301f8ac6c	Merge Decorrelate work, and simplification to RandomCombine, into pruned_transducer_stateless7	2022-06-11 11:07:07 +08:00
Daniel Povey	bc5c782294	Limit magnitude of linear_pos	2022-06-01 10:40:54 +08:00
Daniel Povey	61619c031e	Add activation balancer to stop activations in self_attn from getting too large	2022-06-01 00:40:45 +08:00
Daniel Povey	1651fe0d42	Merge changes from pruned_transducer_stateless4->5	2022-05-31 13:00:11 +08:00
Daniel Povey	741dcd1d6d	Move pruned_transducer_stateless4 to pruned_transducer_stateless7	2022-05-31 12:45:28 +08:00

45 Commits