
Introduction
icefall contains ASR recipes for various datasets using https://github.com/k2-fsa/k2.
You can use https://github.com/k2-fsa/sherpa to deploy models trained with icefall.
Installation
Please refer to https://icefall.readthedocs.io/en/latest/installation/index.html for installation.
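After installing, a quick sanity check (a throwaway snippet, not part of icefall itself) is to confirm that the core dependencies are importable; icefall itself is typically made visible by adding its root directory to PYTHONPATH:

```python
# Minimal post-installation sanity check: icefall builds on PyTorch and k2.
import torch
import k2
import icefall  # noqa: F401  -- works once icefall's root is on PYTHONPATH

print("torch", torch.__version__, "| k2 loaded from", k2.__file__)
```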
Recipes
Please refer to https://icefall.readthedocs.io/en/latest/recipes/index.html for more information.
We provide the following recipes:
yesno
This is the simplest ASR recipe in icefall and can be run on CPU.
Training takes less than 30 seconds and gives you the following WER:
`[test_set] %WER 0.42% [1 / 240, 0 ins, 1 del, 0 sub ]`
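The bracketed numbers follow the usual WER definition: insertions plus deletions plus substitutions, divided by the number of reference words. A throwaway check (not icefall code) with the yesno numbers above:

```python
# WER = (insertions + deletions + substitutions) / reference word count.
# The yesno result above: 0 ins, 1 del, 0 sub over 240 reference words.
ins, dels, subs, ref_words = 0, 1, 0, 240
wer = 100.0 * (ins + dels + subs) / ref_words
print(f"%WER {wer:.2f}%")  # -> %WER 0.42%
```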
We provide a Colab notebook for this recipe.
LibriSpeech
Please see https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/RESULTS.md for the latest results.
We provide 4 models for this recipe:
- conformer CTC model
- TDNN LSTM CTC model
- Transducer: Conformer encoder + LSTM decoder
- Transducer: Conformer encoder + Embedding decoder
Conformer CTC Model
The best WER we currently have is:
|     | test-clean | test-other |
|-----|------------|------------|
| WER | 2.42       | 5.73       |
We provide a Colab notebook to run a pre-trained conformer CTC model.
TDNN LSTM CTC Model
The WER for this model is:
|     | test-clean | test-other |
|-----|------------|------------|
| WER | 6.59       | 17.69      |
We provide a Colab notebook to run a pre-trained TDNN LSTM CTC model.
Transducer: Conformer encoder + LSTM decoder
This model uses a Conformer as the encoder and an LSTM as the decoder.
The best WER with greedy search is:
|     | test-clean | test-other |
|-----|------------|------------|
| WER | 3.07       | 7.51       |
We provide a Colab notebook to run a pre-trained RNN-T conformer model.
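For reference, greedy search is the simplest transducer decoding method: at each encoder frame, take the arg-max of the joiner output and advance the decoder only when a non-blank symbol is emitted. The sketch below is schematic rather than the actual icefall implementation; the `decoder` and `joiner` callables and their signatures are assumptions, and it emits at most one symbol per frame for brevity:

```python
import torch

def greedy_search(decoder, joiner, encoder_out, blank_id=0):
    """Schematic transducer greedy search; encoder_out has shape (T, C)."""
    hyp = []
    dec_out, state = decoder(blank_id, None)  # prime the decoder with blank
    for t in range(encoder_out.size(0)):
        logits = joiner(encoder_out[t : t + 1], dec_out)  # (1, vocab_size)
        token = logits.argmax(dim=-1).item()
        if token != blank_id:                 # emit at most one symbol per frame
            hyp.append(token)
            dec_out, state = decoder(token, state)  # advance only on emission
    return hyp
```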
Transducer: Conformer encoder + Embedding decoder
This model uses a Conformer as the encoder. The decoder consists of an embedding layer followed by a convolutional layer.
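A schematic PyTorch sketch of such a "stateless" decoder (the module and argument names are illustrative; the actual icefall code differs in details): token embeddings are mixed by a 1-D convolution over a small, fixed left context, so no recurrent state needs to be carried:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StatelessDecoder(nn.Module):
    """Embedding + 1-D convolution over a fixed left context (sketch)."""

    def __init__(self, vocab_size: int, embed_dim: int, context_size: int = 2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Conv1d(embed_dim, embed_dim, kernel_size=context_size)

    def forward(self, y: torch.Tensor) -> torch.Tensor:
        # y: (batch, num_tokens) token ids
        embed = self.embedding(y).permute(0, 2, 1)               # (B, E, U)
        embed = F.pad(embed, (self.conv.kernel_size[0] - 1, 0))  # pad on the left
        return self.conv(embed).permute(0, 2, 1)                 # (B, U, E)

dec = StatelessDecoder(vocab_size=500, embed_dim=512)
print(dec(torch.randint(0, 500, (4, 10))).shape)                 # (4, 10, 512)
```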
The best WER using modified beam search with beam size 4 is:
|     | test-clean | test-other |
|-----|------------|------------|
| WER | 2.56       | 6.27       |
Note: No auxiliary losses are used during training, and no LMs are used during decoding.
We provide a Colab notebook to run a pre-trained transducer conformer + stateless decoder model.
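To make the decoding method above concrete, here is a much-simplified, batch-size-1 sketch of modified beam search: every hypothesis in the beam emits at most one symbol per frame, and only the top `beam` hypotheses survive. The `decoder`/`joiner` interfaces are hypothetical (matching the stateless decoder sketched earlier), and duplicate-hypothesis merging is omitted:

```python
import torch

def modified_beam_search(decoder, joiner, encoder_out, beam=4,
                         blank_id=0, context_size=2):
    """Simplified sketch: keep top-`beam` hypotheses, <=1 emission per frame."""
    hyps = [([blank_id] * context_size, 0.0)]  # (tokens, log-probability)
    for t in range(encoder_out.size(0)):
        candidates = []
        for tokens, lp in hyps:
            y = torch.tensor(tokens[-context_size:]).unsqueeze(0)
            dec_out = decoder(y)[:, -1, :]     # stateless decoder output
            log_probs = joiner(encoder_out[t : t + 1], dec_out)
            log_probs = log_probs.log_softmax(dim=-1).squeeze(0)
            candidates.append((tokens, lp + log_probs[blank_id].item()))  # blank
            values, indices = log_probs.topk(beam)
            for p, tok in zip(values.tolist(), indices.tolist()):
                if tok != blank_id:
                    candidates.append((tokens + [tok], lp + p))           # extend
        hyps = sorted(candidates, key=lambda h: h[1], reverse=True)[:beam]
    best = max(hyps, key=lambda h: h[1])[0]
    return best[context_size:]                 # strip the blank start context
```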
k2 pruned RNN-T
|     | test-clean | test-other |
|-----|------------|------------|
| WER | 2.57       | 5.95       |
k2 pruned RNN-T + GigaSpeech
|     | test-clean | test-other |
|-----|------------|------------|
| WER | 2.00       | 4.63       |
Aishell
We provide two models for this recipe: a conformer CTC model and a TDNN LSTM CTC model.
Conformer CTC Model
The best CER we currently have is:
|     | test |
|-----|------|
| CER | 4.26 |
We provide a Colab notebook to run a pre-trained conformer CTC model.
WenetSpeech
We provide one model for this recipe: Pruned stateless RNN-T: Conformer encoder + Embedding decoder + k2 pruned RNN-T loss.

|                      | Dev  | Test-Net | Test-Meeting |
|----------------------|------|----------|--------------|
| greedy search        | 7.80 | 8.75     | 13.49        |
| fast beam search     | 7.94 | 8.74     | 13.80        |
| modified beam search | 7.76 | 8.71     | 13.41        |
We provide a Colab notebook to run a pre-trained Pruned Transducer Stateless model.
Alimeeting
We provide one model for this recipe: Pruned stateless RNN-T: Conformer encoder + Embedding decoder + k2 pruned RNN-T loss.
Pruned stateless RNN-T: Conformer encoder + Embedding decoder + k2 pruned RNN-T loss (trained with far subset)
|                      | Eval  | Test-Net |
|----------------------|-------|----------|
| greedy search        | 31.77 | 34.66    |
| fast beam search     | 31.39 | 33.02    |
| modified beam search | 30.38 | 34.25    |
We provide a Colab notebook to run a pre-trained Pruned Transducer Stateless model.
Deployment with C++
Once you have trained a model in icefall, you may want to deploy it with C++, without Python dependencies.
Please refer to the documentation https://icefall.readthedocs.io/en/latest/recipes/librispeech/conformer_ctc.html#deployment-with-c for how to do this.
We also provide a Colab notebook showing you how to run a torch-scripted model in k2 with C++.
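The export side, in outline, scripts the trained model with TorchScript and saves it to disk; the saved file can then be loaded from C++ via torch::jit::load with no Python dependency. A hedged, self-contained sketch (the tiny model and file name are stand-ins, not icefall's actual export.py):

```python
import torch
import torch.nn as nn

class TinyModel(nn.Module):
    """Stand-in for a trained icefall model whose forward pass is scriptable."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(80, 10)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x)

model = TinyModel().eval()
scripted = torch.jit.script(model)  # compile to TorchScript
scripted.save("jit_model.pt")       # in C++: torch::jit::load("jit_model.pt")
```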