icefall

Mirror of https://github.com/k2-fsa/icefall.git, synced 2025-12-11 06:55:27 +00:00

Author	SHA1	Message	Date
Daniel Povey	cd4730b657	Try to refactor the code for scheduling	2022-11-14 12:50:24 +08:00
Daniel Povey	e4a3b2da7d	Mostly-cosmetic fixes found via mypy	2022-11-09 17:40:09 +08:00
Daniel Povey	e08f5c1bce	Replace Pooling module with ModifiedSEModule	2022-11-01 14:38:06 +08:00
Daniel Povey	a067fe8026	Fix clamping of epsilon	2022-10-28 12:50:14 +08:00
Daniel Povey	7b8a0108ea	Merge branch 'scaled_adam_exp188' into scaled_adam_exp198b	2022-10-28 12:49:36 +08:00
Daniel Povey	b9f6ba1aa2	Remove some unused variables.	2022-10-28 12:01:45 +08:00
Daniel Povey	bf37c7ca85	Regularize how we apply the min and max to the eps of BasicNorm	2022-10-26 12:51:20 +08:00
Daniel Povey	78f3cba58c	Add logging about memory used.	2022-10-25 19:19:33 +08:00
Daniel Povey	6a6df19bde	Hopefully make penalize_abs_values_gt more memory efficient.	2022-10-25 18:41:33 +08:00
Daniel Povey	dbfbd8016b	Cast to float16 in DoubleSwish forward	2022-10-25 13:16:00 +08:00
Daniel Povey	36cb279318	More memory efficient backprop for DoubleSwish.	2022-10-25 12:21:22 +08:00
Daniel Povey	95aaa4a8d2	Store only half precision output for softmax.	2022-10-23 21:24:46 +08:00
Daniel Povey	d3876e32c4	Make it use float16 if in amp but use clamp to avoid wrapping error	2022-10-23 21:13:23 +08:00
Daniel Povey	85657946bb	Try a more exact way to round to uint8 that should prevent ever wrapping around to zero	2022-10-23 20:56:26 +08:00
Daniel Povey	d6aa386552	Fix randn to rand	2022-10-23 17:19:19 +08:00
Daniel Povey	e586cc319c	Change the discretization of the sigmoid to be expectation preserving.	2022-10-23 17:11:35 +08:00
Daniel Povey	09cbc9fdab	Save some memory in the autograd of DoubleSwish.	2022-10-23 16:59:43 +08:00
Daniel Povey	b7083e7aff	Increase default max_factor for ActivationBalancer from 0.02 to 0.04; decrease max_abs in ConvolutionModule.deriv_balancer2 from 100.0 to 20.0	2022-10-23 00:09:21 +08:00
Daniel Povey	e0c1dc66da	Increase probs of activation balancer and make it decay slower.	2022-10-22 22:18:38 +08:00
Daniel Povey	84580ec022	Configuration changes: scores limit 5->10, min_prob 0.05->0.1, cur_grad_scale more aggressive increase	2022-10-22 14:09:53 +08:00
Daniel Povey	9672dffac2	Merge branch 'scaled_adam_exp168' into scaled_adam_exp169	2022-10-22 14:05:07 +08:00
Daniel Povey	bdbd2cfce6	Penalize too large weights in softmax of AttentionDownsample()	2022-10-21 20:12:36 +08:00
Daniel Povey	476fb9e9f3	Reduce min_prob of ActivationBalancer from 0.1 to 0.05.	2022-10-21 15:42:04 +08:00
Daniel Povey	6e6209419c	Merge branch 'scaled_adam_exp150' into scaled_adam_exp155 # Conflicts: # egs/librispeech/ASR/pruned_transducer_stateless7/conformer.py	2022-10-20 15:04:27 +08:00
Daniel Povey	4565d43d5c	Add hard limit of attention weights to +- 50	2022-10-20 14:28:22 +08:00
Daniel Povey	6601035db1	Reduce min_abs from 1.0e-04 to 5.0e-06	2022-10-20 13:53:10 +08:00
Daniel Povey	5a0914fdcf	Merge branch 'scaled_adam_exp149' into scaled_adam_exp150	2022-10-20 13:31:22 +08:00
Daniel Povey	679ba2ee5e	Remove debug print	2022-10-20 13:30:55 +08:00
Daniel Povey	610281eaa2	Keep just the RandomGrad changes, vs. 149. Git history may not reflect real changes.	2022-10-20 13:28:50 +08:00
Daniel Povey	d137118484	Get the randomized backprop for softmax in autocast mode working.	2022-10-20 13:23:48 +08:00
Daniel Povey	d75d646dc4	Merge branch 'scaled_adam_exp147' into scaled_adam_exp149	2022-10-20 12:59:50 +08:00
Daniel Povey	f6b8f0f631	Fix bug in backprop of random_clamp()	2022-10-20 12:49:29 +08:00
Daniel Povey	f08a869769	Merge branch 'scaled_adam_exp151' into scaled_adam_exp150	2022-10-19 19:59:07 +08:00
Daniel Povey	cc15552510	Use full precision to do softmax and store ans.	2022-10-19 19:53:53 +08:00
Daniel Povey	a4443efa95	Add RandomGrad with min_abs=1.0e-04	2022-10-19 19:46:17 +08:00
Daniel Povey	0ad4462632	Reduce min_abs from 1e-03 to 1e-04	2022-10-19 19:27:28 +08:00
Daniel Povey	ef5a27388f	Merge branch 'scaled_adam_exp146' into scaled_adam_exp149	2022-10-19 19:16:27 +08:00
Daniel Povey	9c54906e63	Implement randomized backprop for softmax.	2022-10-19 19:16:03 +08:00
Daniel Povey	f4442de1c4	Add reflect=0.1 to invocations of random_clamp()	2022-10-19 12:34:26 +08:00
Daniel Povey	c3c655d0bd	Random clip attention scores to -5..5.	2022-10-19 11:59:24 +08:00
Daniel Povey	6b3f9e5036	Changes to avoid bug in backward hooks, affecting diagnostics.	2022-10-19 11:06:17 +08:00
Daniel Povey	1135669e93	Bug fix RE float16	2022-10-16 10:58:22 +08:00
Daniel Povey	fc728f2738	Reorganize Whiten() code; configs are not the same as before. Also remove MaxEig for self_attn module	2022-10-15 23:20:18 +08:00
Daniel Povey	96023419da	Reworking of ActivationBalancer code to hopefully balance speed and effectiveness.	2022-10-14 19:20:32 +08:00
Daniel Povey	5f375be159	Merge branch 'scaled_adam_exp103b2' into scaled_adam_exp103b4	2022-10-14 15:27:10 +08:00
Daniel Povey	15b91c12d6	Reduce stats period from 10 to 4.	2022-10-14 15:14:06 +08:00
Daniel Povey	db8b9919da	Reduce beta from 0.75 to 0.0.	2022-10-14 15:12:59 +08:00
Daniel Povey	23d6bf7765	Fix bug when channel_dim < 0	2022-10-13 13:52:28 +08:00
Daniel Povey	49c6b6943d	Change scale_factor_scale from 0.5 to 0.8	2022-10-12 20:55:52 +08:00
Daniel Povey	b736bb4840	Cosmetic improvements	2022-10-12 19:34:48 +08:00

67 Commits