Daniel Povey | 691633b049 | Merge branch 'scaled_adam_exp564' into scaled_adam_exp569 | 2022-12-01 16:25:53 +08:00
Daniel Povey | 2102038e0e | Fix bug in diagnostics.py | 2022-12-01 16:23:50 +08:00
Daniel Povey | d294449221 | Fix typo 0.21->0.2 | 2022-12-01 15:29:46 +08:00
Daniel Povey | f0c46ce564 | Double nonlin_skip_rate and have it last twice longer. | 2022-12-01 15:28:44 +08:00
Daniel Povey | cac1a8b860 | Merge branch 'scaled_adam_exp569' into scaled_adam_exp576 | 2022-12-01 15:20:20 +08:00
Daniel Povey | 4621e924ba | Introduce dropout schedule for NonlinAttentionModule | 2022-12-01 15:19:51 +08:00
Daniel Povey | dcf6fced40 | Change whitening limit of NonlinAttentionModule from _whitening_schedule(7.5) to 5.0 | 2022-12-01 14:28:56 +08:00
Daniel Povey | 025bcc155d | Change scale_min of Conv2dSubsampling from .01 to .1; some cosmetic changes/unimportant bugfixes. | 2022-12-01 14:20:15 +08:00
Daniel Povey | ba31272c92 | Change sigmoid to tanh in NonlinAttentionModule, and adjust abs limits of balancer to compensate. | 2022-11-30 21:44:45 +08:00
Daniel Povey | d682ecc246 | Introduce alpha for DoubleSwish, set it to -0.05. | 2022-11-30 18:58:25 +08:00
Daniel Povey | 2969eb5467 | Fix diagnostics bug | 2022-11-30 16:52:21 +08:00
Daniel Povey | b79a794706 | Fix bug in diagnostics RE gpu | 2022-11-30 16:02:18 +08:00
Daniel Povey | b7cad258bb | Draft of new diagnostics for activations | 2022-11-30 15:57:24 +08:00
Daniel Povey | c75c2dc91d | Reduce min_abs of zipformer-encoder-layer balancer from 0.4 to 0.25. | 2022-11-30 13:40:53 +08:00
Daniel Povey | d100aed58b | Revert "Reduce min of scale in Conv2dSubsampling from 0.01 to 0.2"; this reverts commit 7589e3768975df10c3d022beb4c88f14c2f25d3d. | 2022-11-30 13:17:20 +08:00
Daniel Povey | 12e8c3f0fa | One more layer on input | 2022-11-29 16:47:24 +08:00
Daniel Povey | 640d48262f | Double scale on aux_loss | 2022-11-29 16:20:21 +08:00
Daniel Povey | 7589e37689 | Reduce min of scale in Conv2dSubsampling from 0.01 to 0.2 | 2022-11-29 16:18:41 +08:00
Daniel Povey | 441fcf063d | Reduce final value of bypass_min from 0.25 to 0.2 | 2022-11-29 16:15:34 +08:00
Daniel Povey | 73e420865c | Revert min_abs in NonlinAttentionModule from 2.0 to 1.5 | 2022-11-29 15:53:29 +08:00
Daniel Povey | f48786534a | Merge branch 'scaled_adam_exp535' into scaled_adam_exp548 | 2022-11-29 15:43:44 +08:00
Daniel Povey | 28c5923986 | Remove max_factor=0.02 option in bottleneck balancer of class AttentionSqueeze, change its min,max positive to 0.2,0.8 | 2022-11-29 15:43:25 +08:00
Daniel Povey | 5632782ee1 | Merge branch 'scaled_adam_exp539' into scaled_adam_exp548 | 2022-11-29 15:40:23 +08:00
Daniel Povey | b90d8aabde | Revert the alternate-layers-only thing for nonlin_attention and attention_squeeze | 2022-11-29 15:38:55 +08:00
Daniel Povey | 753269668a | Change ratio in NonlinAttentionModule from 8 to 2 | 2022-11-29 15:38:13 +08:00
Daniel Povey | 93942725c4 | Increase min_abs of balancer of encoder layer from 0.2 to 0.4. | 2022-11-29 13:46:47 +08:00
Daniel Povey | 36a2f33a6f | Have value dim in NonlinAttentionModule be half of num_channels | 2022-11-28 21:55:06 +08:00
Daniel Povey | 258d4f1353 | Let ratio be 8, not 2, for sigmoid in NonlinAttentionModule | 2022-11-28 21:51:29 +08:00
Daniel Povey | 7018c722b5 | Let ratio of values to sigmoids be 8, not 2 | 2022-11-28 21:50:11 +08:00
Daniel Povey | 643c547eec | Double just the value dim in NonlinAttentionLayer. | 2022-11-28 20:56:47 +08:00
Daniel Povey | 88bc45d596 | Halve scale on aux_loss | 2022-11-28 16:37:46 +08:00
Daniel Povey | cee62c823d | Have final prob of aux_loss for input projections be 0 | 2022-11-28 16:36:17 +08:00
Daniel Povey | 9cf5d92f39 | Have nonlin_attention and attention_squeeze operate only on every other layer. | 2022-11-28 16:24:24 +08:00
Daniel Povey | 87ef4078d3 | Add two more layers. | 2022-11-28 13:56:40 +08:00
Daniel Povey | f483f1e0ef | Implement attention weights sharing for successive layers, for Zipformer | 2022-11-28 13:41:11 +08:00
Daniel Povey | 2a289d38b7 | Make max_abs for feedforward module be a constant at 15.0 | 2022-11-28 13:19:37 +08:00
Daniel Povey | 27a12a982b | Increase min_abs and max_abs in feedforward module. | 2022-11-28 12:52:28 +08:00
Daniel Povey | 121f7e2a45 | Documentation fix. | 2022-11-28 12:10:08 +08:00
Daniel Povey | c6d859dd05 | Increase min_abs of balancer in NonlinAttentionModule from 1.5 to 2.0. | 2022-11-28 11:35:00 +08:00
Daniel Povey | 39ce60bb7c | Decrease final value of max_abs in AttentionSqueeze from 5.0 to 1.0 | 2022-11-28 10:45:53 +08:00
Daniel Povey | 0bfd81d721 | Fix bug RE dims_to_mean | 2022-11-28 10:42:06 +08:00
Daniel Povey | 109825cafb | Fix problem with mean offset in LinearWithAuxLoss. | 2022-11-28 09:46:01 +08:00
Daniel Povey | a3b07fd098 | Double aux_grad scale | 2022-11-28 00:19:03 +08:00
Daniel Povey | 9752778ee6 | Use the same schedule for in_proj as out_proj. Only affects a couple of modules. | 2022-11-28 00:09:26 +08:00
Daniel Povey | 9e7add6be8 | Work out alpha (scale on z) in LinearWithAuxLossFunction | 2022-11-27 23:48:26 +08:00
Daniel Povey | 0307252832 | Bug fix | 2022-11-27 21:33:37 +08:00
Daniel Povey | 5128ff8797 | Changes to balancer min_abs/max_abs limits. | 2022-11-27 21:14:41 +08:00
Daniel Povey | a610011c3c | Partially revert sign_gain_factor | 2022-11-27 17:18:33 +08:00
Daniel Povey | 30d0bc6ad7 | Make gain factor 4 times larger, for constraining the sign in ActivationBalancer. | 2022-11-27 17:17:11 +08:00
Daniel Povey | 785a524341 | Increase in_abs of hidden balancer of ff modules from 0.2 to 1.0 | 2022-11-27 17:06:31 +08:00
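Several of the commits above set values according to a schedule over training rather than a fixed constant (the dropout schedule for NonlinAttentionModule, _whitening_schedule(7.5), the "final value" of bypass_min and max_abs, the "final prob" of aux_loss). As a rough sketch of that idea only, not the actual icefall implementation, such a schedule can be expressed as piecewise-linear interpolation over the batch count; the function name, breakpoints, and values below are hypothetical and chosen purely for illustration.

    def piecewise_linear(x, points):
        # Interpolate linearly between (x, y) breakpoints, clamping outside
        # the covered range. `points` must be sorted by increasing x.
        if x <= points[0][0]:
            return points[0][1]
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            if x <= x1:
                return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
        return points[-1][1]

    # Hypothetical example: a dropout rate that starts at 0.2 and decays
    # linearly to 0.0 over the first 20000 batches, then stays at 0.0.
    dropout_schedule = [(0.0, 0.2), (20000.0, 0.0)]
    print(piecewise_linear(5000.0, dropout_schedule))  # -> 0.15

Reading the schedules this way, a commit such as "Decrease final value of max_abs in AttentionSqueeze from 5.0 to 1.0" changes only the y-value of the last breakpoint, leaving the early-training behaviour unchanged.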