Yuekai Zhang 5df24c1685
Whisper large fine-tuning on wenetspeech, mutli-hans-zh (#1483)
Co-authored-by: zr_jin <peter.jin.cn@gmail.com>
2024-03-07 19:04:27 +08:00

# Introduction
This recipe includes scripts for training a Zipformer model on multiple Chinese datasets.
# Included Training Sets
1. THCHS-30
2. AiShell-{1,2,4}
3. ST-CMDS
4. Primewords
5. MagicData
6. Aidatatang_200zh
7. AliMeeting
8. WeNetSpeech
9. KeSpeech-ASR

|Dataset|Number of hours|URL|
|---|---:|---|
|**TOTAL**|14,106|---|
|THCHS-30|35|https://www.openslr.org/18/|
|AiShell-1|170|https://www.openslr.org/33/|
|AiShell-2|1,000|http://www.aishelltech.com/aishell_2|
|AiShell-4|120|https://www.openslr.org/111/|
|ST-CMDS|110|https://www.openslr.org/38/|
|Primewords|99|https://www.openslr.org/47/|
|aidatatang_200zh|200|https://www.openslr.org/62/|
|MagicData|755|https://www.openslr.org/68/|
|AliMeeting|100|https://openslr.org/119/|
|WeNetSpeech|10,000|https://github.com/wenet-e2e/WenetSpeech|
|KeSpeech|1,542|https://github.com/KeSpeech/KeSpeech|
# Included Test Sets
1. AiShell-{1,2,4}
2. Aidatatang_200zh
3. AliMeeting
4. MagicData
5. KeSpeech-ASR
6. WeNetSpeech