748 Commits

Author SHA1 Message Date
Daniel Povey
de1fd91435 Adding max_abs=3.0 to ActivationBalancer modules inside feedforward modules. 2022-07-16 07:19:26 +08:00
Daniel Povey
23e6d2e6d8 Fix to the fix 2022-07-16 06:53:44 +08:00
Daniel Povey
4c8d77d14a Fix return type 2022-07-15 14:18:07 +08:00
Daniel Povey
68c5935691 Fix bug re param_cov freshness, properly. 2022-07-15 08:33:10 +08:00
Daniel Povey
b6ee698278 Make LR update period less frequent later in training; fix bug with param_cov freshness (it was too fresh) 2022-07-15 07:59:30 +08:00
Daniel Povey
689441b237 Reduce param_pow from 0.75 to 0.5 2022-07-14 06:08:06 +08:00
Daniel Povey
7f6fe02db9 Fix formula for smoothing (was applying more smoothing than intended, and in the opposite sense to intended), also revert max_rms from 2.0 to 4.0 2022-07-14 06:06:02 +08:00
Daniel Povey
4785245e5c Reduce debug freq 2022-07-13 06:51:23 +08:00
Daniel Povey
d48fe0b99c Change max rms from 10.0 to 4.0 2022-07-13 05:53:35 +08:00
Daniel Povey
cedfb5a377 Make max eig ratio 10 2022-07-12 13:59:58 +08:00
Daniel Povey
278358bb9f Remove debug code 2022-07-12 08:39:14 +08:00
Daniel Povey
8c44ff26f7 Fix bug in batching code for scalars 2022-07-12 08:36:45 +08:00
Daniel Povey
25cb8308d5 Add max_block_size=512 to PrAdam 2022-07-12 08:35:14 +08:00
Daniel Povey
41df045773 Simplify formula, getting rid of scalar_exp_avg_sq 2022-07-11 17:14:12 -07:00
Daniel Povey
4f0e219523 Bug fix to reproduce past results with max_block_size unset. 2022-07-11 17:03:32 -07:00
Daniel Povey
075a2e27d8 Replace max_fullcov_size with max_block_size 2022-07-11 16:37:01 -07:00
Daniel Povey
3468c3aa5a Remove ActivationBalancer, unnecessary 2022-07-11 14:12:24 -07:00
Daniel Povey
7993c84cd6 Apparently working version, with changed test-code topology 2022-07-11 13:17:29 -07:00
Daniel Povey
245d39b1bb Still debugging but close to done 2022-07-11 00:33:37 -07:00
Daniel Povey
27da50a1f6 Committing partial work.. 2022-07-10 15:46:32 -07:00
Daniel Povey
d25df4af5e Slight refactoring, preparing for batching. 2022-07-09 22:24:36 -07:00
Daniel Povey
d9a6180ae0 Bug fix 2022-07-10 10:20:39 +08:00
Daniel Povey
b7035844a2 Introduce scalar_max; stop eps from getting too large or too small 2022-07-10 10:13:55 +08:00
Daniel Povey
2f73434541 Reduce debug frequency 2022-07-10 06:44:50 +08:00
Daniel Povey
b3bb2dac6f Iterative, more principled way of estimating param_cov 2022-07-10 06:28:01 +08:00
Daniel Povey
d139c18f22 Max eig of Q limited to 5 times the mean 2022-07-09 14:30:03 +08:00
Daniel Povey
ffeef4ede4 Remove rank-1 dims (i.e., where size == numel()) from processing. 2022-07-09 13:36:48 +08:00
Daniel Povey
2fc9eb9789 Respect param_pow 2022-07-09 12:49:04 +08:00
Daniel Povey
209acaf6e4 Increase lr_update_period to 200. The update takes about 2 minutes for the entire model. 2022-07-09 11:36:54 +08:00
Daniel Povey
61cab3ab65 Introduce grad_cov_period 2022-07-09 10:29:23 +08:00
Daniel Povey
35a51bc153 Reduce debug probs 2022-07-09 10:22:19 +08:00
Daniel Povey
65bc964854 Fix bug for scalar update 2022-07-09 10:14:20 +08:00
Daniel Povey
aa2237a793 Bug fix 2022-07-09 10:11:54 +08:00
Daniel Povey
50ee414486 Fix train.py for new optimizer 2022-07-09 10:09:53 +08:00
Daniel Povey
6810849058 Implement new version of learning method. Does more complete diagonalization of grads than the previous methods. 2022-07-09 10:02:17 +08:00
Daniel Povey
a9edecd32c Confirmed that symmetrizing helps because of interaction with the regular update; meta_lr_scale=0 is still best :-( 2022-07-09 05:20:04 +08:00
Daniel Povey
52bfb2b018 This works better for reasons I don't understand; transposing is enough, same as symmetrizing. 2022-07-08 11:53:59 +08:00
Daniel Povey
e9ab1ddd39 Inconsequential config change 2022-07-08 11:03:16 +08:00
Daniel Povey
be6680e3ba A couple of configuration changes; comment simplification 2022-07-08 09:46:42 +08:00
Daniel Povey
75e872ea57 Fix bug in getting denom in proj update 2022-07-08 09:13:54 +08:00
Daniel Povey
914ac1e621 Works better with meta_lr_scale=0; must be a bug. 2022-07-08 09:07:06 +08:00
Daniel Povey
923468b8af Deal with SVD failure better. 2022-07-08 09:00:12 +08:00
Daniel Povey
97feb8a3ec Reduce meta_lr_scale; reduces loss @140 from 1.4 to 0.39 2022-07-08 06:33:07 +08:00
Daniel Povey
b6199a71e9 Introduce delta_scale to slow down changes to M; significantly better. 2022-07-08 06:05:31 +08:00
Daniel Povey
ceb9815f2b Increase lr_est_period 2022-07-08 05:51:18 +08:00
Daniel Povey
fb36712e6b Another bug fix, regarding Q being transposed. 2022-07-08 05:22:24 +08:00
Daniel Povey
ad2e698fc3 Cleanups 2022-07-08 04:44:21 +08:00
Daniel Povey
04d2e10b4f Version that runs 2022-07-08 04:37:46 +08:00
Daniel Povey
e6d00ee3e4 More drafts of new method, not tested. 2022-07-06 23:05:06 -07:00
Daniel Povey
26815d177f Draft of the new method.. 2022-07-06 22:59:36 -07:00