Daniel Povey | 62c34f15c6 | Remove print statement | 2023-05-17 13:22:02 +08:00
Daniel Povey | e4246f6ba3 | Reduce batch size from 24 to 22 | 2023-05-17 13:20:23 +08:00
Daniel Povey | 6dce7e251d | Increase batch size | 2023-05-17 13:17:00 +08:00
Daniel Povey | 53410608a6 | Try to implement test mode; fix issue where middle stack had not been downsampled. | 2023-05-17 13:03:19 +08:00
Daniel Povey | 30ace76fbc | Add depthwise conv to decoder | 2023-05-17 11:26:41 +08:00
Daniel Povey | 610b2270aa | Bug fixes | 2023-05-16 23:08:13 +08:00
Daniel Povey | a405106d2f | Add 1-d convolution to text embedding module; reduce batch size | 2023-05-16 20:05:52 +08:00
Daniel Povey | 399a79ace6 | Change chunk-size setup | 2023-05-16 19:47:23 +08:00
Daniel Povey | a6eb45840a | Reduce batch size | 2023-05-16 17:39:59 +08:00
Daniel Povey | e062c71076 | Efficiency, small fix | 2023-05-16 17:34:21 +08:00
Daniel Povey | cf93d1f129 | Bug fix regarding chunk-size reshaping | 2023-05-16 17:30:48 +08:00
Daniel Povey | 5f5df4367d | Fix error in how src was reshaped | 2023-05-16 17:19:47 +08:00
Daniel Povey | 0412d19f50 | Increase batch size | 2023-05-16 16:33:17 +08:00
Daniel Povey | 3f72813a96 | Various bug fixes, implementing chunking | 2023-05-16 16:27:09 +08:00
Daniel Povey | 0006a4c4db | Implement chunk sizes, to the extent that the program runs. | 2023-05-16 16:13:20 +08:00
Daniel Povey | 4562b25a6a | Remove unused options | 2023-05-16 14:25:19 +08:00
Daniel Povey | bfeeddda81 | Reduce mem consumption of softmax backward | 2023-05-16 12:18:09 +08:00
Daniel Povey | 465d41c429 | Increase batch size | 2023-05-16 12:13:13 +08:00
Daniel Povey | 8001a46758 | Fix bugs | 2023-05-15 22:49:43 +08:00
Daniel Povey | cc81ec4f8a | bug fix | 2023-05-15 22:07:27 +08:00
Daniel Povey | 0a76215fd7 | Code cleanup | 2023-05-15 22:01:19 +08:00
Daniel Povey | 671e9ee5bd | Restore old warmup schedule | 2023-05-15 20:40:41 +08:00
Daniel Povey | d2d0ce0335 | Try to get rid of gradient blowup | 2023-05-15 20:26:21 +08:00
Daniel Povey | 2e66392306 | Change warmup schedule | 2023-05-15 20:20:15 +08:00
Daniel Povey | 532f95a627 | Reduce batch size slightly | 2023-05-15 20:13:48 +08:00
Daniel Povey | a397a5973b | Increase num parameters | 2023-05-15 20:11:20 +08:00
Daniel Povey | 047c6ffc58 | First version of subformer that runs. | 2023-05-15 16:03:01 +08:00
Daniel Povey | 1b8be0744f | Fix various bugs | 2023-05-15 15:20:02 +08:00
Daniel Povey | f740282a1a | More progress on subformer | 2023-05-15 10:57:48 +08:00
Daniel Povey | 5c470fe397 | rename zipformer to subformer, remove some things that won't be used. | 2023-05-13 22:55:16 +08:00
Daniel Povey | 2e4b27a1c8 | Adding subformer as initially just a copy of zipformer | 2023-05-13 21:30:24 +08:00
Daniel Povey | 2f1d377727 | Reduce batch size so it fits in memory | 2023-05-04 17:01:30 +08:00
Daniel Povey | f0264bed1b | Fix DDP issue; Change configurations, reducing subsampling factor; increase sequence length. | 2023-05-04 16:18:31 +08:00
Daniel Povey | 45f5e9981d | Bug fix | 2023-05-04 15:41:29 +08:00
Daniel Povey | 86c2c60100 | Step lr_scheduler on tokens not epoch; add some more debug output | 2023-05-04 15:35:22 +08:00
Daniel Povey | 3574e7dbb5 | Initial version of zipformer1 LM that runs, not sure whether it is working | 2023-05-04 14:46:06 +08:00
Daniel Povey | 75e9f1a34a | Fix bug with indicator | 2023-05-02 13:36:03 +08:00
Daniel Povey | c207c55e94 | alias Transducer | 2023-05-02 13:19:21 +08:00
Daniel Povey | 1ab2a4c662 | Add text embeddings, but use actual text for now | 2023-05-01 22:09:27 +08:00
Daniel Povey | fa696e919b | Add memory to model | 2023-05-01 20:47:09 +08:00
Daniel Povey | 6f5c4688ef | Add (back) straight_through_rate, with rate 0.025; try to handle memory allocation failures in backprop better. | 2023-04-30 15:19:34 +08:00
Daniel Povey | e4626a14b8 | Change length_factor from 3.0 to 1.0 | 2023-04-27 22:38:45 +08:00
Daniel Povey | 6c26754628 | Fix tests, make SwooshL and SwooshR more efficient in forward pass. | 2023-04-27 22:37:19 +08:00
yaozengwei | 55a1abc9da | separate Conv2dSubsampling from Zipformer | 2023-04-27 10:11:47 +08:00
yaozengwei | 0ec31c84da | remove skip_modules | 2023-04-24 15:50:12 +08:00
yaozengwei | 2e80841790 | set --lr-batches=7500 | 2023-04-24 15:49:41 +08:00
yaozengwei | 9291a39f58 | remove all lr_scales, set layer3_channels=128, change the position of feed_forward1 | 2023-04-24 15:45:38 +08:00
yaozengwei | 2cd1933873 | remove similar-named args in decode.py | 2023-04-14 14:24:57 +08:00
yaozengwei | 87d9491fba | minor fix in decode.py, about args | 2023-04-13 17:20:25 +08:00
yaozengwei | d27e61170b | set --base-lr=0.045 as default | 2023-04-12 19:12:07 +08:00