Daniel Povey | a02199df79 | Fix bug | 2023-03-29 20:33:43 +08:00
Daniel Povey | f1dbf4222e | Divide feature_mask into 3 groups | 2023-03-29 16:22:39 +08:00
Daniel Povey | b8f0756133 | Add comment | 2023-03-29 14:05:28 +08:00
Daniel Povey | bb8cbd7598 | Sometimes mask more frames. | 2023-03-29 13:08:52 +08:00
Daniel Povey | 4e36656cef | Remove import that is no longer there | 2023-03-10 14:45:02 +08:00
Daniel Povey | 07b685936a | Fix typo | 2023-03-07 21:53:12 +08:00
Daniel Povey | e692e0b228 | Add balancer for keys | 2023-03-07 17:39:01 +08:00
Daniel Povey | f59da65d82 | Remove some more unused code; rename BasicNorm->BiasNorm, Zipformer->Zipformer2 | 2023-03-06 14:27:11 +08:00
Daniel Povey | 3424b60d8f | Remove some unused code | 2023-03-06 14:18:01 +08:00
Daniel Povey | 0191e8f3e4 | Simplify how dim changes are dealt with; see also scaled_adam_exp977 | 2023-02-22 11:40:33 +08:00
Daniel Povey | 90180ce5e7 | Make layer-skip-dropout-prob decrease to 0.0 | 2023-02-20 16:33:04 +08:00
Daniel Povey | e0b8a0cfd0 | Fix batch_size position bug in layer_skip | 2023-02-16 15:13:06 +08:00
Daniel Povey | 686e7e8828 | Remove some unhelpful or unused options in decode.py, setting equivalent to --left-context=0 for padding. Restore default of causal training. | 2023-02-13 12:58:33 +08:00
Daniel Povey | a5fb97d298 | Merge branch 'scaled_adam_exp999' into scaled_adam_exp1002 | 2023-02-13 12:49:49 +08:00
Daniel Povey | 5842de9464 | Prevent left_context_chunks from being 0. | 2023-02-13 12:49:16 +08:00
Daniel Povey | dc481ca419 | Disable causal training; add balancers in decoder. | 2023-02-11 23:10:21 +08:00
Daniel Povey | f9f546968c | Revert warmup_batches change; make code change to avoid non in attn_weights | 2023-02-11 18:46:05 +08:00
Daniel Povey | b0c87a93d2 | Increase warmup of LR from 500 to 1000 batches | 2023-02-11 18:27:20 +08:00
Daniel Povey | db543866d8 | Remove unused debug statement. | 2023-02-11 17:43:37 +08:00
Daniel Povey | 8ccd061051 | Fix bug where attn_mask was not passed in. | 2023-02-11 17:31:21 +08:00
Daniel Povey | e9157535a4 | Remove unused variable | 2023-02-11 15:53:33 +08:00
Daniel Povey | 4b27ffa911 | Fix causal option in SmallConvolutionModule (unused) | 2023-02-11 15:53:09 +08:00
Daniel Povey | 49627c6251 | Cosmetic fix to formula | 2023-02-11 15:38:03 +08:00
Daniel Povey | 3cb43c3c36 | Fix issue in decode.py | 2023-02-11 14:45:45 +08:00
Daniel Povey | 329175c897 | Change how chunk-size is specified | 2023-02-11 14:35:31 +08:00
Daniel Povey | ad388890d9 | Make most forms of sequence dropout be separate per sequence. | 2023-02-10 16:34:01 +08:00
Daniel Povey | e7e7560bba | Implement chunking | 2023-02-10 15:02:29 +08:00
Daniel Povey | b2fb504aee | Merge branch 'scaled_adam_exp912' into scaled_adam_exp994 | 2023-02-08 21:15:21 +08:00
Daniel Povey | 659ca97001 | Remove small_conv_module and make nonlin_attention_module slightly wider | 2023-02-08 13:56:22 +08:00
Daniel Povey | b2303e02c5 | Revert "Make scale in NonlinAttention have glu nonlinearity." (reverts commit 048b6b6259a715c4b8225d493fdcd8df88e42b1f) | 2023-01-18 11:27:57 +08:00
Daniel Povey | 80b2c751e3 | Merge branch 'scaled_adam_exp896' into scaled_adam_exp904 | 2023-01-16 13:18:42 +08:00
Daniel Povey | ed65330261 | RemoveAttentionSqueeze | 2023-01-16 13:18:29 +08:00
Daniel Povey | fb30d11693 | Merge branch 'scaled_adam_exp891' into scaled_adam_exp896 | 2023-01-15 12:52:41 +08:00
Daniel Povey | 048b6b6259 | Make scale in NonlinAttention have glu nonlinearity. | 2023-01-15 00:21:01 +08:00
Daniel Povey | eeadc3b0cc | Add a multiplication to NonlinAttentionModule | 2023-01-14 20:41:30 +08:00
Daniel Povey | 4fe91ce67c | Double hidden_channels in NonlinAttention from embed_dim//4 to embed_dim//2. | 2023-01-14 17:19:34 +08:00
Daniel Povey | ec8804283c | Try to make SmallConvolutionModule more efficient | 2023-01-14 14:54:46 +08:00
Daniel Povey | 167b58baa0 | Make output dim of Zipformer be max dim | 2023-01-14 14:29:29 +08:00
Daniel Povey | fb7a967276 | Increase unmasked dims | 2023-01-13 17:38:11 +08:00
Daniel Povey | bebc27f274 | Increasing encoder-dim of some layers, and unmasked-dim | 2023-01-13 17:36:45 +08:00
Daniel Povey | e6af583ee1 | Increase encoder-dim of slowest stack from 320 to 384 | 2023-01-13 14:40:42 +08:00
Daniel Povey | a88587dc8a | Fix comment; have 6, not 4, layers in most-downsampled stack. | 2023-01-13 00:12:46 +08:00
Daniel Povey | 5958f1ee11 | Remove memory-allocated printouts | 2023-01-12 22:14:52 +08:00
Daniel Povey | bac72718f0 | Bug fixes, config changes | 2023-01-12 22:11:42 +08:00
Daniel Povey | d3b3592986 | Fix bug to allow down+up sampling | 2023-01-12 21:18:34 +08:00
Daniel Povey | 1e04c3d892 | Reduce dimension for speed, have varying dims | 2023-01-12 21:15:39 +08:00
Daniel Povey | 9e4b84f374 | Simplify Conv2dSubsampling, removing all but one ConvNext layer | 2023-01-12 20:14:51 +08:00
Daniel Povey | 65f15c9d14 | Reduce final_layerdrop_rate coefficient. | 2023-01-12 20:00:49 +08:00
Daniel Povey | 3fdfec1049 | Replace dropout2 on Conv2dSubsampling with Dropout3, share time dim | 2023-01-11 13:18:08 +08:00
Daniel Povey | 1774853bdf | Remove caching eval | 2023-01-11 13:12:25 +08:00