1819 Commits

Author SHA1 Message Date
Daniel Povey
62c34f15c6 Remove print statement 2023-05-17 13:22:02 +08:00
Daniel Povey
e4246f6ba3 Reduce batch size from 24 to 22 2023-05-17 13:20:23 +08:00
Daniel Povey
6dce7e251d Increase batch size 2023-05-17 13:17:00 +08:00
Daniel Povey
53410608a6 Try to implement test mode; fix issue where middle stack had not been downsampled. 2023-05-17 13:03:19 +08:00
Daniel Povey
30ace76fbc Add depthwise conv to decoder 2023-05-17 11:26:41 +08:00
Daniel Povey
610b2270aa Bug fixes 2023-05-16 23:08:13 +08:00
Daniel Povey
a405106d2f Add 1-d convolution to text embedding module; reduce batch size 2023-05-16 20:05:52 +08:00
Daniel Povey
399a79ace6 Change chunk-size setup 2023-05-16 19:47:23 +08:00
Daniel Povey
a6eb45840a Reduce batch size 2023-05-16 17:39:59 +08:00
Daniel Povey
e062c71076 Efficiency, small fix 2023-05-16 17:34:21 +08:00
Daniel Povey
cf93d1f129 Bug fix regarding chunk-size reshaping 2023-05-16 17:30:48 +08:00
Daniel Povey
5f5df4367d Fix error in how src was reshaped 2023-05-16 17:19:47 +08:00
Daniel Povey
0412d19f50 Increase batch size 2023-05-16 16:33:17 +08:00
Daniel Povey
3f72813a96 Various bug fixes, implementing chunking 2023-05-16 16:27:09 +08:00
Daniel Povey
0006a4c4db Implement chunk sizes, to the extent that the program runs. 2023-05-16 16:13:20 +08:00
Daniel Povey
4562b25a6a Remove unused options 2023-05-16 14:25:19 +08:00
Daniel Povey
bfeeddda81 Reduce mem consumption of softmax backward 2023-05-16 12:18:09 +08:00
Daniel Povey
465d41c429 Increase batch size 2023-05-16 12:13:13 +08:00
Daniel Povey
8001a46758 Fix bugs 2023-05-15 22:49:43 +08:00
Daniel Povey
cc81ec4f8a bug fix 2023-05-15 22:07:27 +08:00
Daniel Povey
0a76215fd7 Code cleanup 2023-05-15 22:01:19 +08:00
Daniel Povey
671e9ee5bd Restore old warmup schedule 2023-05-15 20:40:41 +08:00
Daniel Povey
d2d0ce0335 Try to get rid of gradient blowup 2023-05-15 20:26:21 +08:00
Daniel Povey
2e66392306 Change warmup schedule 2023-05-15 20:20:15 +08:00
Daniel Povey
532f95a627 Reduce batch size slightly 2023-05-15 20:13:48 +08:00
Daniel Povey
a397a5973b Increase num parameters 2023-05-15 20:11:20 +08:00
Daniel Povey
047c6ffc58 First version of subformer that runs. 2023-05-15 16:03:01 +08:00
Daniel Povey
1b8be0744f Fix various bugs 2023-05-15 15:20:02 +08:00
Daniel Povey
f740282a1a More progress on subformer 2023-05-15 10:57:48 +08:00
Daniel Povey
5c470fe397 rename zipformer to subformer, remove some things that won't be used. 2023-05-13 22:55:16 +08:00
Daniel Povey
2e4b27a1c8 Adding subformer as initially just a copy of zipformer 2023-05-13 21:30:24 +08:00
Daniel Povey
2f1d377727 Reduce batch size so it fits in memory 2023-05-04 17:01:30 +08:00
Daniel Povey
f0264bed1b Fix DDP issue; Change configurations, reducing subsampling factor; increase sequence length. 2023-05-04 16:18:31 +08:00
Daniel Povey
45f5e9981d Bug fix 2023-05-04 15:41:29 +08:00
Daniel Povey
86c2c60100 Step lr_scheduler on tokens not epoch; add some more debug output 2023-05-04 15:35:22 +08:00
Daniel Povey
3574e7dbb5 Initial version of zipformer1 LM that runs, not sure whether it is working 2023-05-04 14:46:06 +08:00
Daniel Povey
75e9f1a34a Fix bug with indicator 2023-05-02 13:36:03 +08:00
Daniel Povey
c207c55e94 alias Transducer 2023-05-02 13:19:21 +08:00
Daniel Povey
1ab2a4c662 Add text embeddings, but use actual text for now 2023-05-01 22:09:27 +08:00
Daniel Povey
fa696e919b Add memory to model 2023-05-01 20:47:09 +08:00
Daniel Povey
6f5c4688ef Add (back) straight_through_rate, with rate 0.025; try to handle memory allocation failures in backprop better. 2023-04-30 15:19:34 +08:00
Daniel Povey
e4626a14b8 Change length_factor from 3.0 to 1.0 2023-04-27 22:38:45 +08:00
Daniel Povey
6c26754628 Fix tests, make SwooshL and SwooshR more efficient in forward pass. 2023-04-27 22:37:19 +08:00
yaozengwei
55a1abc9da separate Conv2dSubsampling from Zipformer 2023-04-27 10:11:47 +08:00
yaozengwei
0ec31c84da remove skip_modules 2023-04-24 15:50:12 +08:00
yaozengwei
2e80841790 set --lr-batches=7500 2023-04-24 15:49:41 +08:00
yaozengwei
9291a39f58 remove all lr_scales, set layer3_channels=128, change the position of feed_forward1 2023-04-24 15:45:38 +08:00
yaozengwei
2cd1933873 remove similar-named args in decode.py 2023-04-14 14:24:57 +08:00
yaozengwei
87d9491fba minor fix in decode.py, about args 2023-04-13 17:20:25 +08:00
yaozengwei
d27e61170b set --base-lr=0.045 as default 2023-04-12 19:12:07 +08:00