Mirror of https://github.com/k2-fsa/icefall.git (synced 2025-08-26 18:24:18 +00:00)

Commit 4237127be2 (parent 6993183dd7): added results on zh-HK

@@ -1,4 +1,73 @@
## Results

### Commonvoice Cantonese (zh-HK) Char training results (Zipformer)

See #1542 for more details.

Number of model parameters: 72526519, i.e., 72.53 M

The best CER for CommonVoice 16.1 (cv-corpus-16.1-2023-12-06/zh-HK) is below:

|                      | Dev   | Test | Note               |
|----------------------|-------|------|--------------------|
| greedy_search        | 1.17  | 1.22 | --epoch 24 --avg 5 |
| modified_beam_search | 0.98  | 1.11 | --epoch 24 --avg 5 |
| fast_beam_search     | 1.08  | 1.27 | --epoch 24 --avg 5 |

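For readers unfamiliar with the metric: CER is conventionally the character-level edit distance between hypothesis and reference, divided by the total number of reference characters and expressed as a percentage. The following is a minimal sketch of that definition, not the scoring code used by this recipe; the function names `edit_distance` and `cer` are illustrative assumptions.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance (substitutions, insertions, deletions) over characters."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix to each hyp prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def cer(refs, hyps) -> float:
    """Corpus-level CER in percent: total edits / total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_chars = sum(len(r) for r in refs)
    return 100.0 * total_edits / total_chars
```

For example, `cer(["你好世界"], ["你好世"])` counts one deletion against four reference characters and returns `25.0`.
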
For cross-corpus validation on MDCC (without blank penalty), the best CER is below:

|                      | Dev   | Test  | Note               |
|----------------------|-------|-------|--------------------|
| greedy_search        | 42.40 | 42.03 | --epoch 24 --avg 5 |
| modified_beam_search | 39.73 | 39.19 | --epoch 24 --avg 5 |
| fast_beam_search     | 42.14 | 41.98 | --epoch 24 --avg 5 |

For cross-corpus validation on MDCC (with blank penalty set to 2.2), the best CER is below:

|                      | Dev   | Test  | Note                                   |
|----------------------|-------|-------|----------------------------------------|
| greedy_search        | 39.19 | 39.09 | --epoch 24 --avg 5 --blank-penalty 2.2 |
| modified_beam_search | 37.73 | 37.65 | --epoch 24 --avg 5 --blank-penalty 2.2 |
| fast_beam_search     | 37.73 | 37.74 | --epoch 24 --avg 5 --blank-penalty 2.2 |

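A blank penalty is commonly understood as a constant subtracted from the blank symbol's logit before the next symbol is chosen. This makes the decoder less likely to emit blank and so tends to reduce deletions. The sketch below illustrates that idea only; it is not the recipe's decoding code. It uses plain Python lists instead of tensors, assumes the blank symbol has id 0, and the function names are hypothetical.

```python
def apply_blank_penalty(logits, blank_penalty=0.0, blank_id=0):
    """Return a copy of `logits` with the penalty subtracted from the blank entry."""
    adjusted = list(logits)
    adjusted[blank_id] -= blank_penalty
    return adjusted


def greedy_pick(logits, blank_penalty=0.0, blank_id=0):
    """Pick the index of the highest-scoring symbol after penalizing blank."""
    adjusted = apply_blank_penalty(logits, blank_penalty, blank_id)
    return max(range(len(adjusted)), key=adjusted.__getitem__)
```

With logits `[3.0, 1.5, 0.2]`, `greedy_pick` returns the blank (index 0) without a penalty, but returns index 1 with `blank_penalty=2.2`, because the blank score drops from 3.0 to 0.8.
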
To reproduce the above result, use the following commands for training:

```bash
export CUDA_VISIBLE_DEVICES="0,1"
./zipformer/train_char.py \
  --world-size 2 \
  --num-epochs 30 \
  --start-epoch 1 \
  --use-fp16 1 \
  --exp-dir zipformer/exp \
  --cv-manifest-dir data/zh-HK/fbank \
  --language zh-HK \
  --use-validated-set 1 \
  --context-size 1 \
  --max-duration 1000
```

and the following commands for decoding:

```bash
for method in greedy_search modified_beam_search fast_beam_search; do
  ./zipformer/decode_char.py \
    --epoch 24 \
    --avg 5 \
    --decoding-method "$method" \
    --exp-dir zipformer/exp \
    --cv-manifest-dir data/zh-HK/fbank \
    --context-size 1 \
    --language zh-HK
done
```

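The `--epoch 24 --avg 5` options select epoch 24 and average over five checkpoints. As a rough illustration, the sketch below assumes plain parameter averaging over the five consecutive epoch checkpoints ending at epoch 24; the actual decoding script may use a different scheme (for example, a running average maintained during training). Dictionaries of float lists stand in for model state dicts, and both function names are hypothetical.

```python
def epochs_to_average(epoch: int, avg: int):
    """Consecutive epochs ending at `epoch` (assumed selection rule)."""
    return list(range(epoch - avg + 1, epoch + 1))


def average_checkpoints(state_dicts):
    """Element-wise mean of parameters across checkpoints with identical keys."""
    n = len(state_dicts)
    averaged = {}
    for key in state_dicts[0]:
        # zip(*...) walks the same parameter position across all checkpoints
        columns = zip(*(sd[key] for sd in state_dicts))
        averaged[key] = [sum(values) / n for values in columns]
    return averaged
```

Under this assumption, `epochs_to_average(24, 5)` yields epochs 20 through 24.
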
Detailed experimental results and the pre-trained model are available at:
<https://huggingface.co/zrjin/icefall-asr-commonvoice-zh-HK-zipformer-2024-03-20>


### GigaSpeech BPE training results (Pruned Stateless Transducer 7)

#### [pruned_transducer_stateless7](./pruned_transducer_stateless7)

@@ -13,8 +82,8 @@ Results are:
 
 |                      | Dev   | Test  |
 |----------------------|-------|-------|
-| greedy search        | 9.96  | 12.54 |
-| modified beam search | 9.86  | 12.48 |
+| greedy_search        | 9.96  | 12.54 |
+| modified_beam_search | 9.86  | 12.48 |
 
 To reproduce the above result, use the following commands for training:
 
@@ -55,10 +124,6 @@ and the following commands for decoding:
 Pretrained model is available at
 <https://huggingface.co/yfyeung/icefall-asr-cv-corpus-13.0-2023-03-09-en-pruned-transducer-stateless7-2023-04-17>
 
-The tensorboard log for training is available at
-<https://tensorboard.dev/experiment/j4pJQty6RMOkMJtRySREKw/>
-
-
 ### Commonvoice (fr) BPE training results (Pruned Stateless Transducer 7_streaming)
 
 #### [pruned_transducer_stateless7_streaming](./pruned_transducer_stateless7_streaming)
@@ -73,9 +138,9 @@ Results are:
 
 | decoding method      | Test  |
 |----------------------|-------|
-| greedy search        | 9.95  |
-| modified beam search | 9.57  |
-| fast beam search     | 9.67  |
+| greedy_search        | 9.95  |
+| modified_beam_search | 9.57  |
+| fast_beam_search     | 9.67  |
 
 Note: This best result is trained on the full librispeech and gigaspeech, and then fine-tuned on the full commonvoice.
 