1864 Commits

Author        SHA1        Message  Date
Daniel Povey  b526f3af00  Increase num layers  2023-04-04 15:39:32 +08:00
Daniel Povey  c4f669ef00  Increase feedforward dims and num layers  2023-04-04 14:41:23 +08:00
Daniel Povey  7ab1e7f5ec  Combine two layers into one.  2023-04-04 12:14:18 +08:00
Daniel Povey  3dd25d6b2d  Increase feature_mask_dropout_prob to 0.125  2023-04-03 12:13:09 +08:00
Daniel Povey  c2e39bd488  Bug fix  2023-03-31 17:23:25 +08:00
Daniel Povey  cd0f48f508  Mask larger regions  2023-03-31 17:07:22 +08:00
Daniel Povey  d41b73000e  Modify feature_mask_dropout_prob  2023-03-31 13:25:39 +08:00
Daniel Povey  e64ec396bd  Have 2 not 3 groups, but give 1st group a smaller dropout prob than the 2nd.  2023-03-30 16:38:41 +08:00
Daniel Povey  6e058b9ebd  Fix or vs. and bug  2023-03-30 00:00:59 +08:00
Daniel Povey  a02199df79  Fix bug  2023-03-29 20:33:43 +08:00
Daniel Povey  f1dbf4222e  Divide feature_mask into 3 groups  2023-03-29 16:22:39 +08:00
Daniel Povey  b8f0756133  Add comment  2023-03-29 14:05:28 +08:00
Daniel Povey  bb8cbd7598  Sometimes mask more frames.  2023-03-29 13:08:52 +08:00
Daniel Povey  4e36656cef  Remove import that is no longer there  2023-03-10 14:45:02 +08:00
Daniel Povey  07b685936a  Fix typo  2023-03-07 21:53:12 +08:00
Daniel Povey  e692e0b228  Add balancer for keys  2023-03-07 17:39:01 +08:00
Daniel Povey  f59da65d82  Remove some more unused code; rename BasicNorm->BiasNorm, Zipformer->Zipformer2  2023-03-06 14:27:11 +08:00
Daniel Povey  3424b60d8f  Remove some unused code  2023-03-06 14:18:01 +08:00
Daniel Povey  54f087fead  Fix to diagnostics  2023-02-24 16:13:26 +08:00
Daniel Povey  0191e8f3e4  Simplify how dim changes are dealt with; see also scaled_adam_exp977  2023-02-22 11:40:33 +08:00
Daniel Povey  90180ce5e7  Make layer-skip-dropout-prob decrease to 0.0  2023-02-20 16:33:04 +08:00
Daniel Povey  e0b8a0cfd0  Fix batch_size position bug in layer_skip  2023-02-16 15:13:06 +08:00
Daniel Povey  686e7e8828  Remove some unhelpful or unused options in decode.py, setting equivalent to --left-context=0 for padding. Restore default of causal training.  2023-02-13 12:58:33 +08:00
Daniel Povey  a5fb97d298  Merge branch 'scaled_adam_exp999' into scaled_adam_exp1002  2023-02-13 12:49:49 +08:00
Daniel Povey  5842de9464  Prevent left_context_chunks from being 0.  2023-02-13 12:49:16 +08:00
Daniel Povey  dc481ca419  Disable causal training; add balancers in decoder.  2023-02-11 23:10:21 +08:00
Daniel Povey  f9f546968c  Revert warmup_batches change; make code change to avoid non in attn_weights  2023-02-11 18:46:05 +08:00
Daniel Povey  b0c87a93d2  Increase warmup of LR from 500 to 1000 batches  2023-02-11 18:27:20 +08:00
Daniel Povey  db543866d8  Remove unused debug statement.  2023-02-11 17:43:37 +08:00
Daniel Povey  8ccd061051  Fix bug where attn_mask was not passed in.  2023-02-11 17:31:21 +08:00
Daniel Povey  e9157535a4  Remove unused variable  2023-02-11 15:53:33 +08:00
Daniel Povey  4b27ffa911  Fix causal option in SmallConvolutionModule (unused)  2023-02-11 15:53:09 +08:00
Daniel Povey  49627c6251  Cosmetic fix to formula  2023-02-11 15:38:03 +08:00
Daniel Povey  3cb43c3c36  Fix issue in decode.py  2023-02-11 14:45:45 +08:00
Daniel Povey  329175c897  Change how chunk-size is specified  2023-02-11 14:35:31 +08:00
Daniel Povey  ad388890d9  Make most forms of sequence dropout be separate per sequence.  2023-02-10 16:34:01 +08:00
Daniel Povey  e7e7560bba  Implement chunking  2023-02-10 15:02:29 +08:00
Daniel Povey  b2fb504aee  Merge branch 'scaled_adam_exp912' into scaled_adam_exp994  2023-02-08 21:15:21 +08:00
Daniel Povey  659ca97001  Remove small_conv_module and make nonlin_attention_module slightly wider  2023-02-08 13:56:22 +08:00
Daniel Povey  b2303e02c5  Revert "Make scale in NonlinAttention have glu nonlinearity." This reverts commit 048b6b6259a715c4b8225d493fdcd8df88e42b1f.  2023-01-18 11:27:57 +08:00
Daniel Povey  80b2c751e3  Merge branch 'scaled_adam_exp896' into scaled_adam_exp904  2023-01-16 13:18:42 +08:00
Daniel Povey  ed65330261  Remove AttentionSqueeze  2023-01-16 13:18:29 +08:00
Daniel Povey  fb30d11693  Merge branch 'scaled_adam_exp891' into scaled_adam_exp896  2023-01-15 12:52:41 +08:00
Daniel Povey  048b6b6259  Make scale in NonlinAttention have glu nonlinearity.  2023-01-15 00:21:01 +08:00
Daniel Povey  eeadc3b0cc  Add a multiplication to NonlinAttentionModule  2023-01-14 20:41:30 +08:00
Daniel Povey  4fe91ce67c  Double hidden_channels in NonlinAttention from embed_dim//4 to embed_dim//2.  2023-01-14 17:19:34 +08:00
Daniel Povey  ec8804283c  Try to make SmallConvolutionModule more efficient  2023-01-14 14:54:46 +08:00
Daniel Povey  167b58baa0  Make output dim of Zipformer be max dim  2023-01-14 14:29:29 +08:00
Daniel Povey  fb7a967276  Increase unmasked dims  2023-01-13 17:38:11 +08:00
Daniel Povey  bebc27f274  Increase encoder-dim of some layers, and unmasked-dim  2023-01-13 17:36:45 +08:00