1886 Commits

Author SHA1 Message Date
Daniel Povey
db543866d8 Remove unused debug statement. 2023-02-11 17:43:37 +08:00
Daniel Povey
8ccd061051 Fix bug where attn_mask was not passed in. 2023-02-11 17:31:21 +08:00
Daniel Povey
e9157535a4 Remove unused variable 2023-02-11 15:53:33 +08:00
Daniel Povey
4b27ffa911 Fix causal option in SmallConvolutionModule (unused) 2023-02-11 15:53:09 +08:00
Daniel Povey
49627c6251 Cosmetic fix to formula 2023-02-11 15:38:03 +08:00
Daniel Povey
3cb43c3c36 Fix issue in decode.py 2023-02-11 14:45:45 +08:00
Daniel Povey
329175c897 Change how chunk-size is specified 2023-02-11 14:35:31 +08:00
Daniel Povey
ad388890d9 Make most forms of sequence dropout be separate per sequence. 2023-02-10 16:34:01 +08:00
Daniel Povey
e7e7560bba Implement chunking 2023-02-10 15:02:29 +08:00
Daniel Povey
b2fb504aee Merge branch 'scaled_adam_exp912' into scaled_adam_exp994 2023-02-08 21:15:21 +08:00
Daniel Povey
659ca97001 Remove small_conv_module and make nonlin_attention_module slightly wider 2023-02-08 13:56:22 +08:00
Daniel Povey
b2303e02c5 Revert "Make scale in NonlinAttention have glu nonlinearity."
This reverts commit 048b6b6259a715c4b8225d493fdcd8df88e42b1f.
2023-01-18 11:27:57 +08:00
Daniel Povey
80b2c751e3 Merge branch 'scaled_adam_exp896' into scaled_adam_exp904 2023-01-16 13:18:42 +08:00
Daniel Povey
ed65330261 Remove AttentionSqueeze 2023-01-16 13:18:29 +08:00
Daniel Povey
fb30d11693 Merge branch 'scaled_adam_exp891' into scaled_adam_exp896 2023-01-15 12:52:41 +08:00
Daniel Povey
048b6b6259 Make scale in NonlinAttention have glu nonlinearity. 2023-01-15 00:21:01 +08:00
Daniel Povey
eeadc3b0cc Add a multiplication to NonlinAttentionModule 2023-01-14 20:41:30 +08:00
Daniel Povey
4fe91ce67c Double hidden_channels in NonlinAttention from embed_dim//4 to embed_dim//2. 2023-01-14 17:19:34 +08:00
Daniel Povey
ec8804283c Try to make SmallConvolutionModule more efficient 2023-01-14 14:54:46 +08:00
Daniel Povey
167b58baa0 Make output dim of Zipformer be max dim 2023-01-14 14:29:29 +08:00
Daniel Povey
fb7a967276 Increase unmasked dims 2023-01-13 17:38:11 +08:00
Daniel Povey
bebc27f274 Increasing encoder-dim of some layers, and unmasked-dim 2023-01-13 17:36:45 +08:00
Daniel Povey
e6af583ee1 Increase encoder-dim of slowest stack from 320 to 384 2023-01-13 14:40:42 +08:00
Daniel Povey
a88587dc8a Fix comment; have 6, not 4, layers in most-downsampled stack. 2023-01-13 00:12:46 +08:00
Daniel Povey
5958f1ee11 Remove memory-allocated printouts 2023-01-12 22:14:52 +08:00
Daniel Povey
bac72718f0 Bug fixes, config changes 2023-01-12 22:11:42 +08:00
Daniel Povey
d3b3592986 Fix bug to allow down+up sampling 2023-01-12 21:18:34 +08:00
Daniel Povey
1e04c3d892 Reduce dimension for speed, have varying dims 2023-01-12 21:15:39 +08:00
Daniel Povey
9e4b84f374 Simplify Conv2dSubsampling, removing all but one ConvNext layer 2023-01-12 20:14:51 +08:00
Daniel Povey
65f15c9d14 Reduce final_layerdrop_rate coefficient. 2023-01-12 20:00:49 +08:00
Daniel Povey
3fdfec1049 Replace dropout2 on Conv2dSubsampling with Dropout3, share time dim 2023-01-11 13:18:08 +08:00
Daniel Povey
1774853bdf Remove caching eval 2023-01-11 13:12:25 +08:00
Daniel Povey
1580c1c1cc Fix MulForDropout3 2023-01-11 12:26:41 +08:00
Daniel Povey
8bbcd81604 Memory efficient backprop for dropout3 2023-01-10 17:46:32 +08:00
Daniel Povey
4033000730 Share dropout masks across time in ff modules 2023-01-10 17:12:32 +08:00
Daniel Povey
3110ed045a Increase base final_layerdrop_rate from 0.035 to 0.05 2023-01-09 23:32:36 +08:00
Daniel Povey
1d40239d69 Merge branch 'scaled_adam_exp872' into scaled_adam_exp873 2023-01-09 14:52:48 +08:00
Daniel Povey
e739d8aa38 Fix layer_skip_rate so it's actually used, increase its value. 2023-01-09 13:34:32 +08:00
Daniel Povey
1a0155fcb5 Merge branch 'scaled_adam_exp863' into scaled_adam_exp870
# Conflicts:
#	egs/librispeech/ASR/pruned_transducer_stateless7/scaling.py
2023-01-08 23:36:29 +08:00
Daniel Povey
326cb75033 Increase layer_skip_rate slightly 2023-01-08 15:48:23 +08:00
Daniel Povey
62b42887b4 Revert zipformer.py to status on previous commit 2023-01-08 13:17:39 +08:00
Daniel Povey
e952598677 Merge branch 'scaled_adam_exp846' into scaled_adam_exp866 2023-01-08 13:16:24 +08:00
Daniel Povey
117db124d0 Implement higher layerdrop for central stacks 2023-01-08 13:16:10 +08:00
Daniel Povey
c7107ead64 Fix bug in get_adjusted_batch_count 2023-01-07 17:45:22 +08:00
Daniel Povey
b3527fe4ac Implement caching evaluation for ConvNeXt 2023-01-07 17:31:20 +08:00
Daniel Povey
9242800d42 Remove the 8x-subsampled stack 2023-01-07 12:59:57 +08:00
Daniel Povey
ef48019d6e Reduce feedforward-dims 2023-01-06 22:26:58 +08:00
Daniel Povey
9b0c0aabb2 Merge branch 'scaled_adam_exp829' into scaled_adam_exp860
# Conflicts:
#	egs/librispeech/ASR/pruned_transducer_stateless7/zipformer.py
2023-01-06 22:24:45 +08:00
Daniel Povey
6a762914bf Increase base-lr from 0.05 to 0.055 2023-01-06 13:35:57 +08:00
Daniel Povey
5564a0efb0 Further tune lr scales; increase base-lr 2023-01-06 13:34:48 +08:00