
Installation

Please refer to https://icefall.readthedocs.io/en/latest/installation/index.html for installation.

Recipes

Please refer to https://icefall.readthedocs.io/en/latest/recipes/index.html for more information.

We provide four recipes at present:

yesno

This is the simplest ASR recipe in icefall and can be run on CPU. Training takes less than 30 seconds and gives you the following WER:

[test_set] %WER 0.42% [1 / 240, 0 ins, 1 del, 0 sub ]
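The summary line above can be checked by hand: the WER is the number of insertions, deletions and substitutions divided by the number of reference words. The following is a minimal sketch, not part of icefall; the regular expression and the helper names `parse_wer_summary` and `wer_percent` are assumptions made for illustration and only target the exact line format shown above.

```python
import re

# Summary line copied from the yesno result above.
summary = "[test_set] %WER 0.42% [1 / 240, 0 ins, 1 del, 0 sub ]"

# Hypothetical pattern for this one line format; other formats are not handled.
pattern = re.compile(
    r"%WER\s+(?P<wer>[\d.]+)%\s+\[\s*(?P<errs>\d+)\s*/\s*(?P<total>\d+),\s*"
    r"(?P<ins>\d+)\s+ins,\s*(?P<dels>\d+)\s+del,\s*(?P<subs>\d+)\s+sub\s*\]"
)


def parse_wer_summary(line):
    """Extract the reported WER and the error counts from a summary line."""
    match = pattern.search(line)
    if match is None:
        raise ValueError(f"unrecognized summary line: {line!r}")
    fields = {k: int(v) for k, v in match.groupdict().items() if k != "wer"}
    fields["wer"] = float(match.group("wer"))
    return fields


def wer_percent(ins, dels, subs, total):
    """WER in percent: all errors divided by the number of reference words."""
    return 100.0 * (ins + dels + subs) / total


stats = parse_wer_summary(summary)
recomputed = wer_percent(stats["ins"], stats["dels"], stats["subs"], stats["total"])
print(f"{recomputed:.2f}")  # prints 0.42
```

Here one deletion out of 240 reference words gives 100 * 1 / 240, which rounds to the reported 0.42%.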

We provide a Colab notebook for this recipe: Open In Colab

LibriSpeech

We provide 4 models for this recipe:

Conformer CTC Model

The best WER we currently have is:

|     | test-clean | test-other |
|-----|------------|------------|
| WER | 2.42       | 5.73       |

We provide a Colab notebook to run a pre-trained conformer CTC model: Open In Colab

TDNN LSTM CTC Model

The WER for this model is:

|     | test-clean | test-other |
|-----|------------|------------|
| WER | 6.59       | 17.69      |

We provide a Colab notebook to run a pre-trained TDNN LSTM CTC model: Open In Colab

Transducer: Conformer encoder + LSTM decoder

This model uses a Conformer as the encoder and an LSTM as the decoder.

The best WER with greedy search is:

|     | test-clean | test-other |
|-----|------------|------------|
| WER | 3.07       | 7.51       |

We provide a Colab notebook to run a pre-trained RNN-T conformer model: Open In Colab

Transducer: Conformer encoder + Embedding decoder

This model uses a Conformer as the encoder. The decoder consists of one embedding layer and one convolutional layer.
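To make the decoder description concrete, here is a minimal PyTorch sketch of a decoder built from one embedding layer and one 1-D convolutional layer. It is an illustration under stated assumptions, not the recipe's actual code: the class name `StatelessDecoderSketch`, the parameter names, the convolution kernel size (`context_size`), the left padding, and all sizes in the usage lines are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class StatelessDecoderSketch(nn.Module):
    """Hypothetical decoder: one embedding layer followed by one Conv1d layer."""

    def __init__(self, vocab_size, embedding_dim, context_size=2):
        super().__init__()
        # Assumption: the convolution looks at the last `context_size` tokens.
        self.context_size = context_size
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.conv = nn.Conv1d(
            embedding_dim, embedding_dim, kernel_size=context_size, bias=False
        )

    def forward(self, tokens):
        # tokens: (batch, num_tokens), dtype int64
        x = self.embedding(tokens)   # (batch, num_tokens, embedding_dim)
        x = x.permute(0, 2, 1)       # (batch, embedding_dim, num_tokens)
        # Pad on the left only, so position u sees tokens up to u and no later ones.
        x = F.pad(x, (self.context_size - 1, 0))
        x = self.conv(x)             # (batch, embedding_dim, num_tokens)
        return x.permute(0, 2, 1)    # (batch, num_tokens, embedding_dim)


# Toy sizes, chosen only for the demonstration.
decoder = StatelessDecoderSketch(vocab_size=500, embedding_dim=16, context_size=2)
tokens = torch.randint(0, 500, (2, 5))
out = decoder(tokens)
print(out.shape)  # torch.Size([2, 5, 16])
```

Unlike an LSTM decoder, this sketch keeps no recurrent state: each output depends only on a fixed window of preceding tokens.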

The best WER using modified beam search with beam size 4 is:

|     | test-clean | test-other |
|-----|------------|------------|
| WER | 2.56       | 6.27       |

Note: No auxiliary losses are used in training, and no LMs are used in decoding.

We provide a Colab notebook to run a pre-trained transducer conformer + stateless decoder model: Open In Colab

Aishell

We provide three models for this recipe: conformer CTC model, transducer stateless model, and TDNN LSTM CTC model.

Conformer CTC Model

The best CER we currently have is:

|     | test |
|-----|------|
| CER | 4.26 |
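CER (character error rate) is computed like WER, but the edit distance is taken over characters instead of words. The sketch below is an illustration only: the function names are hypothetical, the two sentence pairs are made up, and it does not reproduce the scoring code used by the recipe.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def cer_percent(refs, hyps):
    """CER in percent: total character errors over total reference characters."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return 100.0 * errors / total


# Made-up reference/hypothesis pairs, not taken from the Aishell test set.
refs = ["今天天气很好", "我喜欢语音识别"]
hyps = ["今天天汽很好", "我喜欢语音识"]  # one substitution, one deletion

print(f"{cer_percent(refs, hyps):.2f}")  # prints 15.38
```

The toy example has 2 errors over 13 reference characters, hence 15.38%.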

We provide a Colab notebook to run a pre-trained conformer CTC model: Open In Colab

Transducer Stateless Model

The best CER we currently have is:

|     | test |
|-----|------|
| CER | 4.68 |

We provide a Colab notebook to run a pre-trained TransducerStateless model: Open In Colab

TDNN LSTM CTC Model

The CER for this model is:

|     | test  |
|-----|-------|
| CER | 10.16 |

We provide a Colab notebook to run a pre-trained TDNN LSTM CTC model: Open In Colab

TIMIT

We provide two models for this recipe: TDNN LSTM CTC model and TDNN LiGRU CTC model.

TDNN LSTM CTC Model

The best PER we currently have is:

|     | TEST   |
|-----|--------|
| PER | 19.71% |

We provide a Colab notebook to run a pre-trained TDNN LSTM CTC model: Open In Colab

TDNN LiGRU CTC Model

The PER for this model is:

|     | TEST   |
|-----|--------|
| PER | 17.66% |

We provide a Colab notebook to run a pre-trained TDNN LiGRU CTC model: Open In Colab

Deployment with C++

Once you have trained a model in icefall, you may want to deploy it with C++, without Python dependencies.

Please refer to the documentation https://icefall.readthedocs.io/en/latest/recipes/librispeech/conformer_ctc.html#deployment-with-c for how to do this.

We also provide a Colab notebook showing how to run a TorchScript model in k2 with C++. Please see: Open In Colab
