icefall

Author	SHA1	Message	Date
yaozengwei	42800f775e	remove score sorting in test mode	2023-08-02 19:26:48 +08:00
Daniel Povey	74bf02bba6	Load num_tokens_seen from disk on checkpoint load.	2023-06-20 02:54:47 +08:00
Daniel Povey	b3b3e5daa0	Pad only on the right	2023-06-20 01:58:27 +08:00
Daniel Povey	85b6450a8a	Remove old code	2023-06-19 07:45:57 +08:00
Daniel Povey	6c3ab1e706	Fixes	2023-06-19 04:59:57 +08:00
Daniel Povey	03ad0d7910	Remove concept of epochs from training subformer for language modeling; revert dimensions to how they were in zlm53.	2023-06-19 04:45:37 +08:00
Daniel Povey	c7e8a7349d	Increase dim of middle stack from 512 to 768	2023-06-19 02:16:59 +08:00
Daniel Povey	01ed3bbcc4	Make encoder dims mostly 512.	2023-06-18 04:02:21 +08:00
Daniel Povey	b656b0df36	Changes that should affect nothing: bug fixes etc.	2023-06-18 04:00:43 +08:00
Daniel Povey	70bd58c648	Fix print_diagnostics break statement	2023-06-18 03:55:13 +08:00
Daniel Povey	e9668a5cfd	Fix break in fix_diagnostics mode	2023-06-18 03:36:13 +08:00
Daniel Povey	5a8cabd429	Fix max_eig arg to TensorDiagnosticsOptions	2023-06-18 03:28:57 +08:00
Daniel Povey	7d7fc45ab2	Revert model-size changes	2023-05-30 14:49:42 +08:00
Daniel Povey	d0309c3f3d	Increase penalty cutoff in NonlinAttention to 40.	2023-05-29 23:02:59 +08:00
Daniel Povey	09294c0b51	Merge branch 'zlm51' into zlm52	2023-05-29 20:01:27 +08:00
Daniel Povey	265e190946	Penalize large values in NonlinAttentionModule	2023-05-29 19:17:47 +08:00
Daniel Povey	e313674dc7	Reduce batch size to 15	2023-05-29 17:38:11 +08:00
Daniel Povey	5fbbeb1d29	Try batch size of 16	2023-05-29 17:34:00 +08:00
Daniel Povey	cd36d149df	Reduce encoder-dim and num-heads of center stack.	2023-05-29 17:32:49 +08:00
Daniel Povey	cdd9cf695f	Fix bug regarding --start-batch option	2023-05-29 16:41:54 +08:00
Daniel Povey	cbd59b9c68	Don't skip penalize_abs_values_gt due to memory cutoff; remove grad_scale=0.1	2023-05-29 16:29:48 +08:00
Daniel Povey	7fdd125ba9	Merge branch 'zlm50' into zlm51	2023-05-29 13:54:53 +08:00
Daniel Povey	f05f1a6353	Increase grad_scale and prob in score_balancer	2023-05-29 13:20:07 +08:00
Daniel Povey	0f27b14376	Support unbalanced structures	2023-05-29 13:13:29 +08:00
Daniel Povey	b85012aa0b	Merge branch 'zlm49' into zlm51	2023-05-29 12:20:43 +08:00
Daniel Povey	42f3ad0a11	Remove grad_scale=0.1	2023-05-29 11:55:18 +08:00
Daniel Povey	16e51a7deb	remove find_unused_parameters=True and use bypass module	2023-05-29 11:54:21 +08:00
Daniel Povey	38246c8690	Revert "find_unused_parameters=True removed" This reverts commit ba337f8554c2b0b7e0ab3462027de59862cb95dc.	2023-05-29 11:51:09 +08:00
Daniel Povey	ba337f8554	find_unused_parameters=True removed	2023-05-29 11:47:03 +08:00
Daniel Povey	d975d59c7d	remove bypass_scale	2023-05-29 11:46:18 +08:00
Daniel Povey	d950496d5a	Increase grad_scale in score_balancer	2023-05-29 10:56:01 +08:00
Daniel Povey	79f1863a1e	Fix SoftmaxFunction bug	2023-05-29 10:55:03 +08:00
Daniel Povey	137ac513bf	Some changes to try to reduce mem consumption; decrease batch size	2023-05-28 21:50:34 +08:00
Daniel Povey	625e39fd1a	Avoid penalize_abs_values_gt when memory usage high	2023-05-28 20:40:47 +08:00
Daniel Povey	815cc1ba4f	Add another middle stack; batch size 18->16.	2023-05-28 20:23:30 +08:00
Daniel Povey	bc55fb96eb	Set final skip/bypass rates to zero	2023-05-28 16:30:28 +08:00
Daniel Povey	d045ef7ce7	Change default lr from 0.025 to 0.035	2023-05-28 15:42:54 +08:00
Daniel Povey	da80241179	Use larger valid set; get --print-diagnostics=True to work	2023-05-28 15:17:09 +08:00
Daniel Povey	105fb56db4	Make base-lr default 0.025	2023-05-24 16:30:23 +08:00
Daniel Povey	8483ca2e8f	More partial work	2023-05-24 16:04:05 +08:00
Daniel Povey	e51a2c9170	Partial work	2023-05-23 14:01:04 +08:00
Daniel Povey	bcc9971ebe	Add clip_grad	2023-05-23 14:00:56 +08:00
Daniel Povey	3351402875	Implement train mode in lm_datamodule	2023-05-23 11:08:05 +08:00
Daniel Povey	3a71a53d8d	Set lr_factor on to_scores, max_abs=4.0 on balancer	2023-05-23 10:56:03 +08:00
Daniel Povey	45043e2e21	Merge branch 'zlm25' into zlm26	2023-05-20 22:24:15 +08:00
Daniel Povey	8dc070ce37	Increase all ff dims; decrease batch size.	2023-05-20 13:35:23 +08:00
Daniel Povey	c1de4cc847	Remove factor of 2 in weights_discarded	2023-05-19 20:13:12 +08:00
Daniel Povey	4a425f7eb5	Half the time, flip weights_discarded	2023-05-19 18:04:05 +08:00
Daniel Povey	7d162bf41e	Move where srand called	2023-05-19 16:43:21 +08:00
Daniel Povey	f37ec0f0da	Include start batch in seed	2023-05-19 16:39:13 +08:00

1894 Commits