icefall/egs/multi_ja_en/ASR/RESULTS.md
Bailey Machiko Hirota, 2025-09-02

## Results

### Zipformer

#### Non-streaming

The training command is:

```bash
./zipformer/train.py \
  --world-size 4 \
  --num-epochs 21 \
  --start-epoch 1 \
  --use-fp16 1 \
  --exp-dir zipformer/exp \
  --manifest-dir data/manifests
```

The decoding command is:

```bash
./zipformer/decode.py \
    --epoch 21 \
    --avg 15 \
    --exp-dir ./zipformer/exp \
    --max-duration 600 \
    --decoding-method greedy_search
```
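The results table below also reports `modified_beam_search`. A sketch of the corresponding decoding command, assuming icefall's usual `--beam-size` flag and its default value of 4 (the exact flag name and default are assumptions, not taken from this recipe's docs):

```bash
./zipformer/decode.py \
    --epoch 21 \
    --avg 15 \
    --exp-dir ./zipformer/exp \
    --max-duration 600 \
    --decoding-method modified_beam_search \
    --beam-size 4  # beam width; 4 is the usual icefall default (assumption)
```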

To export the model with ONNX:

```bash
./zipformer/export-onnx.py \
  --tokens ./data/lang/bbpe_2000/tokens.txt \
  --use-averaged-model 0 \
  --epoch 21 \
  --avg 1 \
  --exp-dir ./zipformer/exp
```

Word Error Rates (WERs) are listed below:

| Decoding method | ReazonSpeech dev | ReazonSpeech test | LibriSpeech test-clean | LibriSpeech test-other |
|---|---|---|---|---|
| greedy_search | 5.9 | 4.07 | 3.46 | 8.35 |
| modified_beam_search | 4.87 | 3.61 | 3.28 | 8.07 |

We also include WER (%) for common English ASR datasets:

| Corpus | WER (%) |
|---|---|
| CommonVoice | 29.03 |
| TED | 16.78 |
| MLS English (test-clean) | 8.64 |

And CER (%) for common Japanese datasets:

| Corpus | CER (%) |
|---|---|
| JSUT | 8.13 |
| CommonVoice | 9.82 |
| TEDx | 11.64 |
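WER and CER reduce to the same Levenshtein (edit-distance) computation: WER operates on words, while CER operates on characters, which is why CER is the standard metric for Japanese, where word boundaries are not written. A minimal sketch of how the metrics are computed (not the scoring code icefall itself uses):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (one-row DP)."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            cur = dp[j]
            dp[j] = min(
                dp[j] + 1,        # deletion
                dp[j - 1] + 1,    # insertion
                prev + (r != h),  # substitution (free if symbols match)
            )
            prev = cur
    return dp[-1]

def wer(ref, hyp):
    """Word Error Rate: word-level edits / reference word count."""
    return edit_distance(ref.split(), hyp.split()) / len(ref.split())

def cer(ref, hyp):
    """Character Error Rate: character-level edits / reference length."""
    return edit_distance(ref, hyp) / len(ref)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # one deleted word out of six
print(cer("こんにちは", "こんにちわ"))  # 0.2: one substituted character out of five
```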

The pre-trained model can be found here: https://huggingface.co/reazon-research/reazonspeech-k2-v2-ja-en/tree/m