From 428579e3ac4f4999790fc12e49d70c3553da2623 Mon Sep 17 00:00:00 2001
From: jinzr
Date: Wed, 22 Nov 2023 17:11:19 +0800
Subject: [PATCH] minor updates

---
 egs/multi_zh_en/ASR/README.md  | 19 +++++++++++++++
 egs/multi_zh_en/ASR/RESULTS.md | 44 ++++++++++++++++++++++++++++++++++
 egs/multi_zh_en/ASR/prepare.sh |  1 -
 3 files changed, 63 insertions(+), 1 deletion(-)

diff --git a/egs/multi_zh_en/ASR/README.md b/egs/multi_zh_en/ASR/README.md
index e69de29bb..29341571d 100644
--- a/egs/multi_zh_en/ASR/README.md
+++ b/egs/multi_zh_en/ASR/README.md
@@ -0,0 +1,19 @@
+# Introduction
+
+This recipe includes scripts for training a Zipformer model on both English and Chinese datasets.
+
+# Included Training Sets
+
+1. LibriSpeech (English)
+2. AiShell-2 (Chinese)
+3. TAL-CSASR (Code-Switching, Chinese and English)
+
+|Dataset| Number of hours| URL|
+|---|---:|---|
+|**TOTAL**|2,547|---|
+|LibriSpeech|960|https://www.openslr.org/12/|
+|AiShell-2|1,000|http://www.aishelltech.com/aishell_2|
+|TAL-CSASR|587|https://ai.100tal.com/openData/voice|
+
+

diff --git a/egs/multi_zh_en/ASR/RESULTS.md b/egs/multi_zh_en/ASR/RESULTS.md
index e69de29bb..8ce389ac2 100644
--- a/egs/multi_zh_en/ASR/RESULTS.md
+++ b/egs/multi_zh_en/ASR/RESULTS.md
@@ -0,0 +1,44 @@
+## Results
+
+### Zh-En datasets BPE-based training results (non-streaming) on Zipformer model
+
+This is the [pull request #1265](https://github.com/k2-fsa/icefall/pull/1265) in icefall.
+
+#### Non-streaming (Byte-Level BPE vocab_size=2000)
+
+Best results (number of params: ~69M):
+
+The training command:
+
+```
+./zipformer/train.py \
+  --world-size 4 \
+  --num-epochs 35 \
+  --use-fp16 1 \
+  --max-duration 1000 \
+  --num-workers 8
+```
+
+The decoding command:
+
+```
+for method in greedy_search modified_beam_search fast_beam_search; do
+  ./zipformer/decode.py \
+    --epoch 34 \
+    --avg 19 \
+    --decoding-method $method
+done
+```
+
+Word Error Rates (WERs) listed below are produced with the checkpoint of epoch 34 averaged over the last 19 checkpoints (`--epoch 34 --avg 19`) and a Byte-Level BPE model with 2,000 tokens.
+
+| Datasets             | TAL-CSASR | TAL-CSASR |
+|----------------------|-----------|-----------|
+| Zipformer WER (%)    | dev       | test      |
+| greedy_search        | 6.65      | 6.69      |
+| modified_beam_search | 6.46      | 6.51      |
+| fast_beam_search     | 6.57      | 6.68      |
+
+The pre-trained model can be found at https://huggingface.co/zrjin/icefall-asr-zipformer-multi-zh-en-2023-11-22. It was trained on the LibriSpeech 960-hour training set (with speed perturbation), the TAL-CSASR training set (with speed perturbation), and AiShell-2 (without speed perturbation).
+
+
diff --git a/egs/multi_zh_en/ASR/prepare.sh b/egs/multi_zh_en/ASR/prepare.sh
index a6808363d..566ed29b5 100755
--- a/egs/multi_zh_en/ASR/prepare.sh
+++ b/egs/multi_zh_en/ASR/prepare.sh
@@ -13,7 +13,6 @@ dl_dir=$PWD/download
 
 . shared/parse_options.sh || exit 1
 
 vocab_sizes=(
-  500
   2000
 )
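The prepare.sh hunk above removes the 500 entry, so only a single Byte-Level BPE vocabulary (size 2000) is built. As a minimal sketch of how such a `vocab_sizes` array is typically consumed in icefall-style prepare scripts — the `data/lang_bbpe_*` directory naming here is an assumption for illustration, not taken from this patch:

```shell
#!/usr/bin/env bash
# Sketch: iterate over the vocab_sizes array as left by this patch.
# The lang_dir naming convention below is assumed, not quoted from prepare.sh.
set -euo pipefail

vocab_sizes=(
  2000
)

for vocab_size in "${vocab_sizes[@]}"; do
  # One lang dir per requested vocabulary size.
  lang_dir=data/lang_bbpe_${vocab_size}
  echo "Would train a Byte-Level BPE model with vocab size ${vocab_size} in ${lang_dir}"
done
```

With only one entry left in the array, the loop runs once, which is why the patch reduces the preparation work without changing the loop structure.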