Release v0.1 (#26)

parent f4223ee110 · commit 96e7f5c7ea · README.md
# Table of Contents

- [Installation](#installation)
  * [Install k2](#install-k2)
  * [Install lhotse](#install-lhotse)
  * [Install icefall](#install-icefall)
- [Recipes](#recipes)
<div align="center">
<img src="https://raw.githubusercontent.com/k2-fsa/icefall/master/docs/source/_static/logo.png" width=168>
</div>
## Installation

`icefall` depends on [k2][k2] for FSA operations and [lhotse][lhotse] for
data preparation. To use `icefall`, you have to install its dependencies first.
The following subsections describe how to set up the environment.

CAUTION: There are various ways to set up the environment. What we describe
here is just one alternative.
### Install k2

Please refer to [k2's installation documentation][k2-install] to install k2.
If you have any issues installing k2, please open an issue at
<https://github.com/k2-fsa/k2/issues>.
### Install lhotse

Please refer to [lhotse's installation documentation][lhotse-install] to install
lhotse.
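
For a quick start, lhotse can usually be installed from PyPI. This is just one
route; the linked documentation describes the recommended method for your
environment:

```bash
# One common way to install lhotse; see lhotse's installation
# documentation for the recommended method for your environment.
pip install lhotse
```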
### Install icefall

`icefall` is a collection of Python scripts. All you need to do is install
its requirements and set the environment variable `PYTHONPATH`:
```bash
cd $HOME/open-source
git clone https://github.com/k2-fsa/icefall
cd icefall
pip install -r requirements.txt
export PYTHONPATH=$HOME/open-source/icefall:$PYTHONPATH
```
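
To avoid re-exporting `PYTHONPATH` in every new shell, you can append the
export line to your shell startup file. This sketch assumes bash and the
`$HOME/open-source` clone location used above:

```bash
# Make the PYTHONPATH setting persistent (assumes bash and the
# $HOME/open-source/icefall clone location used above).
echo 'export PYTHONPATH=$HOME/open-source/icefall:$PYTHONPATH' >> ~/.bashrc
```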

To verify that `icefall` was installed successfully, you can run:

```bash
python3 -c "import icefall; print(icefall.__file__)"
```

It should print the path to `icefall`.

For more detailed instructions, please refer to
<https://icefall.readthedocs.io/en/latest/installation/index.html>.
## Recipes

Please refer to <https://icefall.readthedocs.io/en/latest/recipes/index.html>
for more information.

We provide two recipes at present:

  - [yesno][yesno]
  - [LibriSpeech][librispeech]

### yesno

This is the simplest ASR recipe in `icefall` and can be run on CPU.
Training takes less than 30 seconds and gives you the following WER:

```
[test_set] %WER 0.42% [1 / 240, 0 ins, 1 del, 0 sub ]
```
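
The bracketed numbers give the error breakdown: 1 error out of 240 reference
words, consisting of 0 insertions, 1 deletion, and 0 substitutions. You can
verify the percentage yourself:

```bash
# WER = (insertions + deletions + substitutions) / reference words
# Here: (0 + 1 + 0) / 240 ≈ 0.42%
python3 -c "print(f'{(0 + 1 + 0) / 240:.2%}')"
```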

We provide a Colab notebook for this recipe:
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1tIjjzaJc3IvGyKiMCDWO-TSnBgkcuN3B?usp=sharing)
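
If you would rather run the recipe locally, a typical sequence looks like the
sketch below. The directory layout (`egs/yesno/ASR` with a `tdnn` model
folder) is the standard icefall convention and should be checked against the
recipe's own README:

```bash
# A sketch of running the yesno recipe end to end, assuming the
# standard layout: egs/yesno/ASR with a `tdnn` model directory.
cd egs/yesno/ASR
./prepare.sh      # download the data and prepare features
./tdnn/train.py   # train on CPU
./tdnn/decode.py  # decode the test set and print the WER
```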

[k2-install]: https://k2.readthedocs.io/en/latest/installation/index.html
[k2]: https://github.com/k2-fsa/k2
[lhotse]: https://github.com/lhotse-speech/lhotse
[lhotse-install]: https://lhotse.readthedocs.io/en/latest/getting-started.html#installation
### LibriSpeech

We provide two models for this recipe: [conformer CTC model][LibriSpeech_conformer_ctc]
and [TDNN LSTM CTC model][LibriSpeech_tdnn_lstm_ctc].

#### Conformer CTC Model

The best WER we currently have is:

| | test-clean | test-other |
|--|--|--|
| WER | 2.57% | 5.94% |

We provide a Colab notebook to run a pre-trained conformer CTC model:
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1huyupXAcHsUrKaWfI83iMEJ6J0Nh0213?usp=sharing)

See [egs/librispeech/ASR/conformer_ctc/README.md](egs/librispeech/ASR/conformer_ctc/README.md)
for how to use the pre-trained models.
#### TDNN LSTM CTC Model

The WER for this model is:

| | test-clean | test-other |
|--|--|--|
| WER | 6.59% | 17.69% |

We provide a Colab notebook to run a pre-trained TDNN LSTM CTC model:
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1kNmDXNMwREi0rZGAOIAOJo93REBuOTcd?usp=sharing)
[LibriSpeech_tdnn_lstm_ctc]: egs/librispeech/ASR/tdnn_lstm_ctc
[LibriSpeech_conformer_ctc]: egs/librispeech/ASR/conformer_ctc
[yesno]: egs/yesno/ASR
[librispeech]: egs/librispeech/ASR

The sections below, taken from the LibriSpeech recipe directory
(`egs/librispeech/ASR`), describe how to prepare the data and train
models for this recipe.
## Data preparation

If you want to use `./prepare.sh` to download everything for you,
you can just run:

```
./prepare.sh
```

If you have pre-downloaded the LibriSpeech dataset, please
read `./prepare.sh` and modify it to point to the location
of your dataset so that it won't be re-downloaded. After making
the modification, please run:

```
./prepare.sh
```

The script `./prepare.sh` prepares features, the lexicon, LMs, etc.
All generated files are saved in the folder `./data`.

**HINT:** `./prepare.sh` supports the options `--stage` and `--stop-stage`.
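
For example, you can run only a portion of the pipeline. The stage numbers
below are illustrative, since the exact numbering is defined inside
`./prepare.sh`:

```bash
# Run only stages 2 through 4 of the preparation pipeline.
# The stage numbers are illustrative; read ./prepare.sh to see
# what each stage actually does.
./prepare.sh --stage 2 --stop-stage 4
```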

## TDNN-LSTM CTC training

The folder `tdnn_lstm_ctc` contains scripts for CTC training
with TDNN-LSTM models.

Pre-configured parameters for training and decoding are set in the function
`get_params()` within `tdnn_lstm_ctc/train.py`
and `tdnn_lstm_ctc/decode.py`.

Parameters that can be passed from the command line can be listed with:

```
./tdnn_lstm_ctc/train.py --help
./tdnn_lstm_ctc/decode.py --help
```

If you have 4 GPUs on a machine and want to use GPUs 0, 2, and 3 for
multi-GPU training, you can run:

```
export CUDA_VISIBLE_DEVICES="0,2,3"
./tdnn_lstm_ctc/train.py \
  --master-port 12345 \
  --world-size 3
```
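
Note that `--world-size` matches the number of GPUs made visible through
`CUDA_VISIBLE_DEVICES`. The same pattern applies to single-GPU training:

```bash
# Single-GPU training: expose one device and set --world-size to 1.
export CUDA_VISIBLE_DEVICES="0"
./tdnn_lstm_ctc/train.py --world-size 1
```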

If you want to decode by averaging the checkpoints `epoch-8.pt`,
`epoch-9.pt`, and `epoch-10.pt`, you can run:

```
./tdnn_lstm_ctc/decode.py \
  --epoch 10 \
  --avg 3
```
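
In other words, `--epoch N --avg M` averages the last `M` checkpoints up to
and including `epoch-N.pt` (assuming the flags keep this meaning). For
example, to average only `epoch-8.pt` and `epoch-9.pt`:

```bash
# Average epoch-8.pt and epoch-9.pt, then decode with the result.
./tdnn_lstm_ctc/decode.py \
  --epoch 9 \
  --avg 2
```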

## Conformer CTC training

The folder `conformer_ctc` contains scripts for CTC training
with conformer models. The steps for running training and
decoding are similar to those for `tdnn_lstm_ctc`.

Please refer to <https://icefall.readthedocs.io/en/latest/recipes/librispeech.html>
for more details on how to run the models in this recipe.