icefall/README.md
2021-08-24 21:30:30 +08:00

62 lines
1.9 KiB
Markdown

<div align="center">
<img src="https://raw.githubusercontent.com/k2-fsa/icefall/master/docs/source/_static/logo.png" width=168>
</div>
## Installation
Please refer to <https://icefall.readthedocs.io/en/latest/installation/index.html>
for installation.
## Recipes
Please refer to <https://icefall.readthedocs.io/en/latest/recipes/index.html>
for more information.
We provide two recipes at present:
- [yesno][yesno]
- [LibriSpeech][librispeech]
### yesno
This is the simplest ASR recipe in `icefall` and can be run on CPU.
Training takes less than 30 seconds and gives you the following WER:
```
[test_set] %WER 0.42% [1 / 240, 0 ins, 1 del, 0 sub ]
```
We do provide a Colab notebook for this recipe.
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1tIjjzaJc3IvGyKiMCDWO-TSnBgkcuN3B?usp=sharing)
### LibriSpeech
We provide two models for this recipe: [conformer CTC model][LibriSpeech_conformer_ctc]
and [TDNN LSTM CTC model][LibriSpeech_tdnn_lstm_ctc].
#### Conformer CTC Model
The best WER we currently have is:
||test-clean|test-other|
|--|--|--|
|WER| 2.57% | 5.94% |
We provide a Colab notebook to run a pre-trained conformer CTC model: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1huyupXAcHsUrKaWfI83iMEJ6J0Nh0213?usp=sharing)
#### TDNN LSTM CTC Model
The WER for this model is:
||test-clean|test-other|
|--|--|--|
|WER| 6.59% | 17.69% |
We provide a Colab notebook to run a pre-trained TDNN LSTM CTC model: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1kNmDXNMwREi0rZGAOIAOJo93REBuOTcd?usp=sharing)
[LibriSpeech_tdnn_lstm_ctc]: egs/librispeech/ASR/tdnn_lstm_ctc
[LibriSpeech_conformer_ctc]: egs/librispeech/ASR/conformer_ctc
[yesno]: egs/yesno/ASR
[librispeech]: egs/librispeech/ASR