1892 Commits

Author  SHA1  Message  Date
Daniel Povey  0504f705ec  Add Whiten module in NonlinAttentionModule  2022-11-21 18:19:52 +08:00
Daniel Povey  211e3af680  Remove changes in previous merge commit that did not relate to length_factor.  2022-11-21 14:32:05 +08:00
Daniel Povey  a6770657c8  Merge branch 'scaled_adam_exp445' into scaled_adam_exp450  2022-11-21 14:29:50 +08:00
Daniel Povey  836c72dd36  Changes and bug-fixes RE balancers; restore activation in AttentionSqueeze, remove in NonlinAttention.  2022-11-21 14:29:36 +08:00
Daniel Povey  9fe6add587  Fix to diagnostics.py (fix for max being doubled), from scaled_adam_exp446; small cosmetic fixes.  2022-11-21 14:00:55 +08:00
Daniel Povey  a10a0bce7d  Increase length_factor from 1.5 to 3.0.  2022-11-20 16:36:18 +08:00
Daniel Povey  cdfbbdded2  Refactoring, and change length_factor from 2.0 to 1.5.  2022-11-20 16:34:51 +08:00
Daniel Povey  a52ec3da28  Change feedforward dims: increase 1536->1792 for largest ff dim and move it one step later.  2022-11-20 14:24:41 +08:00
Daniel Povey  31b2a735b8  Move feedforward1 to the beginning, separating it from small_conv_module.  2022-11-20 13:17:39 +08:00
Daniel Povey  40c883343a  Merge branch 'scaled_adam_exp439' into scaled_adam_exp440  2022-11-20 13:08:00 +08:00
Daniel Povey  cf16c96edd  Merge branch 'scaled_adam_exp433' into scaled_adam_exp440  2022-11-20 13:07:35 +08:00
Daniel Povey  8b3303594c  Revert 419->420 change, regarding random shift in pos embedding  2022-11-20 13:07:20 +08:00
Daniel Povey  4e21db07f6  Remove activation in AttentionSqueeze; add balancers; fix bugs RE balancers.  2022-11-19 22:05:10 +08:00
Daniel Povey  d23fda7c5f  Multiply length_factor by 2.0.  2022-11-19 13:36:16 +08:00
Daniel Povey  b9871cc4f5  Merge branch 'scaled_adam_exp420' into scaled_adam_exp421  2022-11-18 14:54:36 +08:00
Daniel Povey  0601dd72fd  Bug-fix RE random shift  2022-11-18 14:53:03 +08:00
Daniel Povey  8a095c1cd1  Add SmallConvModule; decrease feedforward dims to keep about the same num params.  2022-11-18 12:46:40 +08:00
Daniel Povey  f7c99ed1d1  Introduce random shift with stddev=1.0 into pos_emb  2022-11-18 12:06:12 +08:00
Daniel Povey  e9806950f5  Reduce pos-dim from 96 to 48.  2022-11-17 23:42:39 +08:00
Daniel Povey  8b50932d5a  Merge branch 'scaled_adam_exp416' into scaled_adam_exp418  2022-11-17 18:34:07 +08:00
Daniel Povey  e73ced1607  Bug fix in formula for pos embedding  2022-11-17 16:02:57 +08:00
Daniel Povey  48f32971f3  Reduce final pos_emb_skip rate from 0.075 to 0.0, and add dropout=0.15 for pos embedding module  2022-11-17 14:33:54 +08:00
Daniel Povey  27f8497fea  Reduce pos_dim from 128 to 96.  2022-11-17 10:39:36 +08:00
Daniel Povey  526b5e59a6  Increase pos-head-dim from 2 to 4.  2022-11-16 11:53:55 +08:00
Daniel Povey  fc74ff63fb  Remove one feedforward module and give its params to the other two.  2022-11-16 11:46:05 +08:00
Daniel Povey  3d47335ab6  Double the duration of layer-skipping warmup, from 2k to 4k.  2022-11-16 11:41:48 +08:00
Daniel Povey  22a1401f36  Remove self_attn1 module  2022-11-16 11:37:08 +08:00
Daniel Povey  d542fa61ff  Double pos_dim from 64 to 128.  2022-11-16 11:35:25 +08:00
Daniel Povey  000af07a2a  Increase final pos_emb_skip rate from 0.05 to 0.075  2022-11-16 11:34:26 +08:00
Daniel Povey  6668814940  Increase pos_emb_skip_rate from 0.05 to 0.075.  2022-11-15 11:50:14 +08:00
Daniel Povey  f76075fd1a  Make pos_emb dropout rate constant during training; also cosmetic changes  2022-11-15 11:42:12 +08:00
Daniel Povey  867556200f  Use zero dropout within the position embedding, but drop out the entire thing with twice the final prob.  2022-11-15 11:39:20 +08:00
Daniel Povey  380f773069  Merge branch 'scaled_adam_exp387' into scaled_adam_exp390  2022-11-15 11:35:54 +08:00
Daniel Povey  a1a4b715d9  Introduce a dropout schedule for the pos embedding at training time (merge conflicts resolved in egs/librispeech/ASR/pruned_transducer_stateless7/zipformer.py).  2022-11-15 11:28:50 +08:00
Daniel Povey  6ea1706e11  Fix potential/theoretical issue in backward of LimitParamValue  2022-11-14 23:31:00 +08:00
Daniel Povey  d1df919547  Cosmetic improvements  2022-11-14 23:26:33 +08:00
Daniel Povey  46bd93b792  Cosmetic fix  2022-11-14 23:17:20 +08:00
Daniel Povey  a680c7de2e  Make bypass_scale a tensor.  2022-11-14 19:12:16 +08:00
Daniel Povey  ff6431ed0f  Implement limits on parameter values a different way.  2022-11-14 16:02:38 +08:00
Daniel Povey  ce4b50d094  Revert making the dropout of pos_emb independent across the batch.  2022-11-14 15:34:39 +08:00
Daniel Povey  804917837e  Remove pos_emb scales  2022-11-14 15:32:54 +08:00
Daniel Povey  ba69eb48fe  Remove pos_emb schedule  2022-11-14 15:31:56 +08:00
Daniel Povey  54048009db  Fix self.training condition  2022-11-14 15:15:24 +08:00
Daniel Povey  e1fb25262a  Refactor the scheduling code a little  2022-11-14 14:52:27 +08:00
Daniel Povey  b32dec1119  Add printing capability  2022-11-14 14:16:28 +08:00
Daniel Povey  4c8575878a  Bug fix in ScheduledSampler  2022-11-14 13:52:14 +08:00
Daniel Povey  614b5b1a52  Treat batch_idx==0.0 separately to get scan_pessimistic_batches_for_oom() to work; should not affect results.  2022-11-14 13:20:31 +08:00
Daniel Povey  cde4ca27ee  Introduce a dropout schedule for the pos embedding at training time.  2022-11-14 13:00:30 +08:00
Daniel Povey  cd4730b657  Try to refactor the code for scheduling  2022-11-14 12:50:24 +08:00
Daniel Povey  aa0b1a37cd  Change to valid interval for libri-100  2022-11-13 23:29:17 +08:00