Commit history (all commits authored by Daniel Povey):

80b2c751e3 | 2023-01-16 13:18:42 +08:00 | Merge branch 'scaled_adam_exp896' into scaled_adam_exp904
ed65330261 | 2023-01-16 13:18:29 +08:00 | Remove AttentionSqueeze
fb30d11693 | 2023-01-15 12:52:41 +08:00 | Merge branch 'scaled_adam_exp891' into scaled_adam_exp896
048b6b6259 | 2023-01-15 00:21:01 +08:00 | Make scale in NonlinAttention have GLU nonlinearity
eeadc3b0cc | 2023-01-14 20:41:30 +08:00 | Add a multiplication to NonlinAttentionModule
4fe91ce67c | 2023-01-14 17:19:34 +08:00 | Double hidden_channels in NonlinAttention from embed_dim//4 to embed_dim//2
ec8804283c | 2023-01-14 14:54:46 +08:00 | Try to make SmallConvolutionModule more efficient
167b58baa0 | 2023-01-14 14:29:29 +08:00 | Make output dim of Zipformer be max dim
fb7a967276 | 2023-01-13 17:38:11 +08:00 | Increase unmasked dims
bebc27f274 | 2023-01-13 17:36:45 +08:00 | Increase encoder-dim of some layers, and unmasked-dim
e6af583ee1 | 2023-01-13 14:40:42 +08:00 | Increase encoder-dim of slowest stack from 320 to 384
a88587dc8a | 2023-01-13 00:12:46 +08:00 | Fix comment; have 6, not 4, layers in most-downsampled stack
5958f1ee11 | 2023-01-12 22:14:52 +08:00 | Remove memory-allocated printouts
bac72718f0 | 2023-01-12 22:11:42 +08:00 | Bug fixes, config changes
d3b3592986 | 2023-01-12 21:18:34 +08:00 | Fix bug to allow down- and up-sampling
1e04c3d892 | 2023-01-12 21:15:39 +08:00 | Reduce dimension for speed; have varying dims
9e4b84f374 | 2023-01-12 20:14:51 +08:00 | Simplify Conv2dSubsampling, removing all but one ConvNeXt layer
65f15c9d14 | 2023-01-12 20:00:49 +08:00 | Reduce final_layerdrop_rate coefficient
3fdfec1049 | 2023-01-11 13:18:08 +08:00 | Replace dropout2 on Conv2dSubsampling with Dropout3; share time dim
1774853bdf | 2023-01-11 13:12:25 +08:00 | Remove caching eval
1580c1c1cc | 2023-01-11 12:26:41 +08:00 | Fix MulForDropout3
8bbcd81604 | 2023-01-10 17:46:32 +08:00 | Memory-efficient backprop for dropout3
4033000730 | 2023-01-10 17:12:32 +08:00 | Share dropout masks across time in feedforward modules
3110ed045a | 2023-01-09 23:32:36 +08:00 | Increase base final_layerdrop_rate from 0.035 to 0.05
1d40239d69 | 2023-01-09 14:52:48 +08:00 | Merge branch 'scaled_adam_exp872' into scaled_adam_exp873
e739d8aa38 | 2023-01-09 13:34:32 +08:00 | Fix layer_skip_rate so it's actually used; increase its value
1a0155fcb5 | 2023-01-08 23:36:29 +08:00 | Merge branch 'scaled_adam_exp863' into scaled_adam_exp870 (conflicts: egs/librispeech/ASR/pruned_transducer_stateless7/scaling.py)
326cb75033 | 2023-01-08 15:48:23 +08:00 | Increase layer_skip_rate slightly
62b42887b4 | 2023-01-08 13:17:39 +08:00 | Revert zipformer.py to its state at the previous commit
e952598677 | 2023-01-08 13:16:24 +08:00 | Merge branch 'scaled_adam_exp846' into scaled_adam_exp866
117db124d0 | 2023-01-08 13:16:10 +08:00 | Implement higher layerdrop for central stacks
c7107ead64 | 2023-01-07 17:45:22 +08:00 | Fix bug in get_adjusted_batch_count
b3527fe4ac | 2023-01-07 17:31:20 +08:00 | Implement caching evaluation for ConvNeXt
9242800d42 | 2023-01-07 12:59:57 +08:00 | Remove the 8x-subsampled stack
ef48019d6e | 2023-01-06 22:26:58 +08:00 | Reduce feedforward dims
9b0c0aabb2 | 2023-01-06 22:24:45 +08:00 | Merge branch 'scaled_adam_exp829' into scaled_adam_exp860 (conflicts: egs/librispeech/ASR/pruned_transducer_stateless7/zipformer.py)
6a762914bf | 2023-01-06 13:35:57 +08:00 | Increase base-lr from 0.05 to 0.055
5564a0efb0 | 2023-01-06 13:34:48 +08:00 | Further tune lr scales; increase base-lr
f6f088489d | 2023-01-05 23:49:42 +08:00 | Adjust lr_scales, making them closer to 1
ccc38a97f7 | 2023-01-05 18:50:04 +08:00 | Reduce lr_scales of some submodules
90c02b471c | 2023-01-05 16:27:43 +08:00 | Revert base LR to 0.05
067b861c70 | 2023-01-05 14:46:15 +08:00 | Use largest LR for printing
6c7fd8c046 | 2023-01-05 14:23:59 +08:00 | Increase base-lr to 0.06
95e8296014 | 2023-01-05 14:23:40 +08:00 | Use downsampling_factor ** -0.333 as the scale for stacks
0d7161ebec | 2023-01-05 14:11:33 +08:00 | Use get_parameter_groups_with_lr in train.py; bug fixes
1db509ea31 | 2023-01-05 13:39:22 +08:00 | Attempt to implement slower learning for downsampled modules
b7be18c2f8 | 2023-01-05 12:23:32 +08:00 | Keep only the needed changes from Liyong's branch
096ebeaf23 | 2023-01-05 12:01:42 +08:00 | Take a couple of files from Liyong's branch
22b4a417dd | 2023-01-04 20:59:58 +08:00 | Implement extra_layerdrop
b973929d7c | 2023-01-04 20:54:05 +08:00 | Bug fixes to ScheduledFloat