1049 Commits

Author SHA1 Message Date
Yuekai Zhang
50b575a2f1 load checkpoint from specific path 2024-03-05 16:37:29 +08:00
Yuekai Zhang
73a7687d8a add dataset 2024-02-23 17:40:23 +08:00
Yuekai Zhang
fa58ed2d2b fix kespeech speed perturb 2024-02-23 10:09:28 +08:00
Yuekai Zhang
73e5caecc5 add speed perturb for kespeech 2024-02-23 09:54:26 +08:00
Yuekai Zhang
5a62723f19 decrease cpu 2024-02-22 20:54:10 +08:00
Yuekai Zhang
f893ae200c add missing option 2024-02-22 20:44:52 +08:00
Yuekai Zhang
0212266730 change to lilcom chunky writer 2024-02-22 16:21:23 +08:00
Yuekai Zhang
910e5db931 add manifests for whisper 2024-02-22 15:55:01 +08:00
Yuekai Zhang
be001a896c fix index error 2024-02-20 10:20:00 +08:00
Yuekai Zhang
6fd14d202b add kespeech whisper feats 2024-02-19 23:03:49 +08:00
Yuekai Zhang
ff75cf6cb3 using soft links 2024-01-31 14:12:59 +08:00
Yuekai Zhang
97aa482ead only test net 2024-01-31 14:02:39 +08:00
Yuekai Zhang
955d16e6b8 only test net 2024-01-31 14:02:39 +08:00
Yuekai Zhang
4826f0801c remove utterance more than 30s in test_net 2024-01-31 14:02:39 +08:00
Yuekai Zhang
d8a329eca5 decode all wav files 2024-01-31 14:02:39 +08:00
Yuekai Zhang
341c29e6e2 fix whisper version to support multi batch beam 2024-01-31 14:02:39 +08:00
Yuekai Zhang
c19891ee8e add remove long short 2024-01-31 14:02:39 +08:00
Yuekai Zhang
bb07b65e45 add remove long short 2024-01-31 14:02:39 +08:00
Yuekai Zhang
1600f7db95 fix too long audios 2024-01-31 14:02:39 +08:00
Yuekai Zhang
b76cd65abf fix subsampling factor 2024-01-31 14:02:39 +08:00
Yuekai Zhang
ad796d929d remove useless file 2024-01-31 14:02:39 +08:00
Yuekai Zhang
e49534f2dd add monkey patch codes 2024-01-31 14:02:39 +08:00
Yuekai Zhang
e1a55b945b add wenetspeech fine-tune scripts 2024-01-31 14:02:39 +08:00
Yuekai Zhang
baa7c5fb8d use multi machines 2024-01-31 14:02:39 +08:00
Yuekai Zhang
cf85019290 parallel jobs 2024-01-31 14:02:39 +08:00
Yuekai Zhang
df54121c41 fix io issue 2024-01-31 14:02:39 +08:00
Yuekai Zhang
af29455c3d add kaldifeatwhisper fbank 2024-01-31 14:02:39 +08:00
Yuekai Zhang
08db3051ad regression 2024-01-31 14:02:39 +08:00
Yuekai Zhang
f66b266aa4 fix executor 2024-01-31 14:02:39 +08:00
Yuekai Zhang
e46e9b77ee fix overwrite 2024-01-31 14:02:39 +08:00
Yuekai Zhang
fd77c5758c change compute feature batch 2024-01-31 14:02:39 +08:00
Yuekai Zhang
f4cf9fb2d3 add aishell2 feat 2024-01-31 14:02:39 +08:00
Yuekai Zhang
aa7b17e410 test feature extractor speed 2024-01-31 14:02:39 +08:00
Yuekai Zhang
d1b010463c add original model decode with 30s 2024-01-31 14:02:39 +08:00
Yuekai Zhang
38f5f45c67 add requirements.txt 2024-01-31 14:02:39 +08:00
Yuekai Zhang
72c9d01724 add decode for wenetspeech 2024-01-31 14:02:39 +08:00
Yuekai Zhang
046e071ca3 add str to bool 2024-01-31 14:02:39 +08:00
Yuekai Zhang
315175a362 add whisper fbank for other dataset 2024-01-31 14:02:39 +08:00
Yuekai Zhang
e43c4da91d add whisper fbank for wenetspeech 2024-01-31 14:02:39 +08:00
zr_jin
37b975cac9
fixed a CI test for wenetspeech (#1476)
* Comply to issue #1149

https://github.com/k2-fsa/icefall/issues/1149
2024-01-27 06:41:56 +08:00
Yuekai Zhang
1c30847947
Whisper Fine-tuning Recipe on Aishell1 (#1466)
* add decode seamlessm4t

* add requirements

* add decoding with avg model

* add token files

* add custom tokenizer

* support deepspeed to finetune large model

* support large-v3

* add model saving

* using monkey patch to replace models

* add manifest dir option
2024-01-27 00:32:30 +08:00
Fangjun Kuang
8d39f9508b
Fix torchscript export to use tokens.txt instead of lang_dir (#1475) 2024-01-26 19:18:33 +08:00
Zengwei Yao
c401a2646b
minor fix of zipformer/optim.py (#1474) 2024-01-26 15:50:11 +08:00
zr_jin
9c494a3329
typos fixed (#1472) 2024-01-25 18:41:43 +08:00
Yifan Yang
559ed150bb
Fix typo (#1471) 2024-01-23 22:51:09 +08:00
zr_jin
ebe97a07b0
Reworked README.md (#1470)
* Rework README.md

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2024-01-23 16:26:24 +08:00
Yifan Yang
5dfc3ed7f9
Fix buffer size of DynamicBucketingSampler (#1468)
* Fix buffer size

* Fix for flake8

---------

Co-authored-by: yifanyeung <yifanyeung@yifanyeung.local>
2024-01-21 02:10:42 +08:00
zr_jin
7bdde9174c
A Zipformer recipe with Byte-level BPE for Aishell-1 (#1464)
* init commit

* Update train.py

* Update decode.py

* Update RESULTS.md

* added `vocab_size`

* removed unused softlinks

* added scripts for testing pretrained models

* set `bpe_model` as required

* re-org the bbpe recipe for aishell
2024-01-16 21:08:35 +08:00
Fangjun Kuang
398401ed27
Update kaldifeat installation doc (#1460) 2024-01-14 14:38:41 +08:00
Xiaoyu Yang
e2fcb42f5f
fix typo (#1455) 2024-01-09 15:41:37 +08:00