224 Commits

Author | SHA1 | Message | Date
Daniel Povey | 2e66392306 | Change warmup schedule | 2023-05-15 20:20:15 +08:00
Daniel Povey | 86c2c60100 | Step lr_scheduler on tokens not epoch; add some more debug output | 2023-05-04 15:35:22 +08:00
Daniel Povey | 4e36656cef | Remove import that is no longer there | 2023-03-10 14:45:02 +08:00
Daniel Povey | 0d7161ebec | Use get_parameter_groups_with_lr in train.py; bug fixes | 2023-01-05 14:11:33 +08:00
Daniel Povey | 1db509ea31 | Attempt to implement slower learning for downsampled modules | 2023-01-05 13:39:22 +08:00
Daniel Povey | b7be18c2f8 | Keep only needed changes from Liyong's branch | 2023-01-05 12:23:32 +08:00
Daniel Povey | 096ebeaf23 | take a couple files from liyong's branch | 2023-01-05 12:01:42 +08:00
Daniel Povey | 8056e0f9af | Make sure param_rms limit is effectively applied; fix tests in optim.py | 2022-12-29 23:55:16 +08:00
Daniel Povey | 28cac1c2dc | Merge debugging changes to optimizer. | 2022-12-20 13:01:50 +08:00
Daniel Povey | bf37c7ca85 | Regularize how we apply the min and max to the eps of BasicNorm | 2022-10-26 12:51:20 +08:00
Daniel Povey | a0507a83a5 | Change scalar_max in optim.py from 2.0 to 5.0 | 2022-10-25 22:58:07 +08:00
Daniel Povey | 146626bb85 | Renaming in optim.py; remove step() from scan_pessimistic_batches_for_oom in train.py | 2022-10-22 17:44:21 +08:00
Daniel Povey | af0fc31c78 | Introduce warmup schedule in optimizer | 2022-10-22 15:15:43 +08:00
Daniel Povey | 1ec9fe5c98 | Make warmup period decrease scale on simple loss, leaving pruned loss scale constant. | 2022-10-22 14:48:53 +08:00
Daniel Povey | efde3757c7 | Reset optimizer state when we change loss function definition. | 2022-10-22 14:30:18 +08:00
Daniel Povey | 857b3735e7 | Fix bug where fewer layers were dropped than should be; remove unnecessary print statement. | 2022-10-10 13:18:40 +08:00
Daniel Povey | dece8ad204 | Various fixes from debugging with nvtx, but removed the NVTX annotations. | 2022-10-09 21:14:52 +08:00
Daniel Povey | bd7dce460b | Reintroduce batching to the optimizer | 2022-10-09 20:29:23 +08:00
Daniel Povey | 24aea947d2 | Fix issues where grad is None, and unused-grad cases | 2022-09-22 19:18:16 +08:00
Daniel Povey | 03a77f8ae5 | Merge branch 'scaled_adam_exp7c' into scaled_adam_exp11c | 2022-09-22 18:15:44 +08:00
Daniel Povey | e2fdfe990c | Loosen limit on param_max_rms, from 2.0 to 3.0; change how param_min_rms is applied. | 2022-09-20 15:20:43 +08:00
Daniel Povey | 3d72a65de8 | Implement max-eig-proportion.. | 2022-09-19 10:26:37 +08:00
Daniel Povey | 69404f61ef | Use scalar_lr_scale for scalars as well as sizes. | 2022-09-18 14:12:27 +08:00
Daniel Povey | bb1bee4a7b | Improve how quartiles are printed | 2022-09-16 17:30:03 +08:00
Daniel Povey | 8298333bd2 | Implement gradient clipping. | 2022-09-16 16:52:46 +08:00
Daniel Povey | 8f876b3f54 | Remove batching from ScaledAdam, in preparation to add gradient norm clipping | 2022-09-16 15:42:56 +08:00
Daniel Povey | 257c961b66 | 1st attempt at scaled_adam | 2022-09-16 13:59:52 +08:00
Daniel Povey | 276928655e | Merge branch 'pradam_exp1m8' into pradam_exp1m7s2 | 2022-08-24 04:17:30 +08:00
Daniel Povey | 64f7166545 | Some cleanups | 2022-08-18 07:03:50 +08:00
Daniel Povey | 5c33899ddc | Increase cov_min[3] from 0.001 to 0.002 | 2022-08-06 16:28:02 +08:00
Daniel Povey | 9bbf8ada57 | Scale up diag of grad_cov by 1.0001 prior to diagonalizing it. | 2022-08-06 07:06:23 +08:00
Daniel Povey | c021b4fec6 | Increase cov_min[3] from 0.0001 to 0.001 | 2022-08-06 07:02:52 +08:00
Daniel Povey | a5b9b7b974 | Cosmetic changes | 2022-08-05 03:51:00 +08:00
Daniel Povey | dc9133227f | Reworked how inverse is done, fixed bug in _apply_min_max_with_metric, regarding how M is normalized. | 2022-08-04 09:46:14 +08:00
Daniel Povey | 766bf69a98 | Reduce cov_max[2] from 4.0 to 3.5 | 2022-08-03 04:10:11 +08:00
Daniel Povey | 129b28aa9b | Increase cov_min[2] from 0.05 to 0.1; decrease cov_max[2] from 5.0 to 4.0. | 2022-08-02 15:17:24 +08:00
Daniel Povey | 202752418a | Increase cov_min[2] from 0.02 to 0.05. | 2022-08-02 15:15:41 +08:00
Daniel Povey | e44ab25e99 | Bug fix | 2022-08-02 14:31:37 +08:00
Daniel Povey | e9f4ada1c0 | Swap the order of applying min and max in smoothing operations | 2022-08-02 11:55:43 +08:00
Daniel Povey | 9473c7e23d | Lots of changes to how min and max are applied, use 1-norm for min in smooth_cov but not _apply_min_max_with_metric. | 2022-08-02 11:29:54 +08:00
Daniel Povey | 6ab4cf615d | 1st draft of new method of normalizing covs that uses normalization w.r.t. spectral 2-norm | 2022-08-02 09:34:37 +08:00
Daniel Povey | 4919134a94 | Merge making hidden_dim an arg | 2022-08-02 09:09:29 +08:00
Daniel Povey | c64bd5ebcd | Merge making hidden_dim an arg | 2022-08-02 09:07:36 +08:00
Daniel Povey | b008340d83 | Merge making hidden_dim an arg | 2022-08-02 09:01:19 +08:00
Daniel Povey | 9f2229edb5 | Merge making hidden_dim an arg | 2022-08-02 08:58:00 +08:00
Daniel Povey | a45f820e25 | Merge making hidden_dim an arg | 2022-08-02 08:56:36 +08:00
Daniel Povey | 804f264ffd | Merge hidden_dim providing it as arg | 2022-08-02 08:40:13 +08:00
Daniel Povey | ee311247ea | Decrease debugging freq | 2022-08-01 03:55:18 +08:00
Daniel Povey | 4c5d49c448 | Some numerical improvements, and a fix to calculation of mean_eig in _apply_min_max_with_metric(), to average over blocks too. | 2022-08-01 03:51:39 +08:00
Daniel Povey | e2cc09a8c6 | Fix issue with max_eig formula; restore cov_min[1]=0.0025. | 2022-07-31 18:29:44 +08:00
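
The log above traces the development of the ScaledAdam optimizer in optim.py (see commits 257c961b66, 8056e0f9af and e2fdfe990c). To illustrate the central idea only, scaling each parameter's update by that parameter's RMS value clamped between param_min_rms and param_max_rms, here is a much-simplified, hypothetical sketch; the real optimizer is considerably more elaborate (parameter batching, an in-optimizer warmup schedule, gradient clipping, and more), and this class is not taken from the repository:

import torch

class SimplifiedScaledAdam(torch.optim.Optimizer):
    """Toy Adam variant whose step size for each tensor is proportional to
    that tensor's RMS, clamped to [param_min_rms, param_max_rms].
    Illustrative only; not the ScaledAdam from this repository."""

    def __init__(self, params, lr=0.03, betas=(0.9, 0.98), eps=1e-8,
                 param_min_rms=1e-5, param_max_rms=3.0, scalar_lr_scale=0.1):
        defaults = dict(lr=lr, betas=betas, eps=eps,
                        param_min_rms=param_min_rms,
                        param_max_rms=param_max_rms,
                        scalar_lr_scale=scalar_lr_scale)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self):
        for group in self.param_groups:
            beta1, beta2 = group["betas"]
            for p in group["params"]:
                if p.grad is None:  # skip unused params (cf. 24aea947d2)
                    continue
                state = self.state[p]
                if len(state) == 0:
                    state["exp_avg"] = torch.zeros_like(p)
                    state["exp_avg_sq"] = torch.zeros_like(p)
                exp_avg, exp_avg_sq = state["exp_avg"], state["exp_avg_sq"]
                # Standard Adam moment updates (bias correction omitted
                # for brevity).
                exp_avg.mul_(beta1).add_(p.grad, alpha=1 - beta1)
                exp_avg_sq.mul_(beta2).addcmul_(p.grad, p.grad,
                                                value=1 - beta2)
                denom = exp_avg_sq.sqrt().add_(group["eps"])
                if p.numel() > 1:
                    # Step size proportional to the parameter's RMS, clamped
                    # so tiny or runaway tensors still get sensible updates.
                    rms = (p ** 2).mean().sqrt().item()
                    rms = min(max(rms, group["param_min_rms"]),
                              group["param_max_rms"])
                    scale = group["lr"] * rms
                else:
                    # Scalars have no meaningful size; use a fixed relative
                    # rate instead (cf. scalar_lr_scale in 69404f61ef).
                    scale = group["lr"] * group["scalar_lr_scale"]
                p.addcdiv_(exp_avg, denom, value=-scale)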
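
Commits 8f876b3f54 and 8298333bd2 add gradient-norm clipping, implemented inside the optimizer itself. As a point of reference only (this is stock PyTorch, not the repo's in-optimizer clipping), clipping before the step looks like:

import torch

model = torch.nn.Linear(16, 16)  # stand-in model for illustration
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

loss = model(torch.randn(4, 16)).pow(2).mean()
loss.backward()
# Rescale all gradients so their combined 2-norm is at most max_norm,
# bounding the update when an outlier batch produces huge gradients.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=2.0)
optimizer.step()
optimizer.zero_grad()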
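
Commit 86c2c60100 steps the LR scheduler on tokens rather than epochs, and af0fc31c78/2e66392306 adjust the warmup schedule. One generic way to key a schedule to tokens processed is sketched below; the schedule shape, constants, and lr_for_tokens helper are made up here and are not this repo's scheduler:

import torch

def lr_for_tokens(base_lr, tokens_seen, warmup_tokens=10_000_000):
    # Linear warmup followed by inverse-sqrt decay, keyed to the number
    # of tokens processed rather than the epoch count (dummy constants).
    if tokens_seen < warmup_tokens:
        return base_lr * tokens_seen / warmup_tokens
    return base_lr * (warmup_tokens / tokens_seen) ** 0.5

model = torch.nn.Linear(8, 8)
optimizer = torch.optim.SGD(model.parameters(), lr=0.0)
tokens_seen = 0
for batch_idx in range(1000):
    tokens_seen += 4096  # tokens in this batch (dummy count)
    for group in optimizer.param_groups:
        group["lr"] = lr_for_tokens(0.05, tokens_seen)
    # ... forward, backward and optimizer.step() would go here ...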