icefall

Sort results to make it more convenient to compare decoding results (#522 )

* Sort result to make it more convenient to compare decoding results

* Add cut_id to recognition results

* add cut_id to results for all recipes

* Fix torch.jit.script

* Fix comments

* Minor fixes

* Fix torch.jit.tracing for Pytorch version before v1.9.0

2022-08-12 07:12:50 +08:00

__init__.py

Begin to use multiple datasets in training (#213 )

2022-02-21 15:27:27 +08:00

asr_datamodule.py

Use jsonl for CutSet in the LibriSpeech recipe. (#397 )

2022-06-06 10:19:16 +08:00

beam_search.py

Fix joiner (#234 )

2022-03-02 16:41:14 +08:00

conformer.py

Fix joiner (#234 )

2022-03-02 16:41:14 +08:00

decode.py

Sort results to make it more convenient to compare decoding results (#522 )

2022-08-12 07:12:50 +08:00

decoder.py

Fix joiner (#234 )

2022-03-02 16:41:14 +08:00

encoder_interface.py

Fix joiner (#234 )

2022-03-02 16:41:14 +08:00

export.py

Various fixes to support torch script. (#371 )

2022-05-16 21:46:59 +08:00

gigaspeech.py

Use jsonl for CutSet in the LibriSpeech recipe. (#397 )

2022-06-06 10:19:16 +08:00

joiner.py

Fix joiner (#234 )

2022-03-02 16:41:14 +08:00

librispeech.py

Use jsonl for CutSet in the LibriSpeech recipe. (#397 )

2022-06-06 10:19:16 +08:00

model.py

Begin to use multiple datasets in training (#213 )

2022-02-21 15:27:27 +08:00

pretrained.py

Ignore padding frames during RNN-T decoding. (#358 )

2022-05-13 07:39:14 +08:00

README.md

Begin to use multiple datasets in training (#213 )

2022-02-21 15:27:27 +08:00

subsampling.py

Begin to use multiple datasets in training (#213 )

2022-02-21 15:27:27 +08:00

test_asr_datamodule.py

Replace load_manifest_lazy with load_manifest for MUSAN. (#412 )

2022-06-09 11:42:18 +08:00

test_decoder.py

Begin to use multiple datasets in training (#213 )

2022-02-21 15:27:27 +08:00

train.py

fix about tensorboard (#516 )

2022-08-04 19:57:12 +08:00

transformer.py

Fix joiner (#234 )

2022-03-02 16:41:14 +08:00

README.md

Introduction

The decoder, i.e., the prediction network, is taken from the paper "RNN-Transducer with Stateless Prediction Network" (https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9054419).
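
To make "stateless" concrete, here is a minimal pure-Python sketch, not the code in decoder.py. It assumes the prediction network reduces to an embedding lookup over the last `context_size` emitted tokens, with no recurrent state carried between steps. The names `make_embedding` and `stateless_decoder`, the averaging step, and the use of 0 as the blank ID are all hypothetical choices for illustration.

```python
import random


def make_embedding(vocab_size, dim, seed=0):
    """Build a random embedding table (hypothetical stand-in for learned weights)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


def stateless_decoder(tokens, embedding, context_size=1, blank_id=0):
    """Return a decoder output that depends only on the last `context_size` tokens.

    Unlike an RNN prediction network, nothing is remembered from earlier
    tokens: the history is left-padded with blanks and then truncated.
    """
    padded = [blank_id] * context_size + list(tokens)
    context = padded[-context_size:]
    dim = len(embedding[0])
    out = [0.0] * dim
    # Average the embeddings of the tokens in the context window.
    for tok in context:
        for i, value in enumerate(embedding[tok]):
            out[i] += value / context_size
    return out


if __name__ == "__main__":
    emb = make_embedding(vocab_size=10, dim=4)
    a = stateless_decoder([5, 3, 7], emb, context_size=1)
    b = stateless_decoder([1, 2, 7], emb, context_size=1)
    # Both histories end in token 7, so the outputs are identical.
    print(a == b)
```

Because the output is a function of a short, fixed window, two hypotheses that end in the same tokens share the same decoder output, which is what makes caching during beam search straightforward.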

You can use the following commands to prepare the data and start the training:

cd egs/librispeech/ASR
./prepare.sh
./prepare_giga_speech.sh

export CUDA_VISIBLE_DEVICES="0,1"

./transducer_stateless_multi_datasets/train.py \
  --world-size 2 \
  --num-epochs 60 \
  --start-epoch 0 \
  --exp-dir transducer_stateless_multi_datasets/exp-100 \
  --full-libri 0 \
  --max-duration 300 \
  --lr-factor 1 \
  --bpe-model data/lang_bpe_500/bpe.model \
  --modified-transducer-prob 0.25 \
  --giga-prob 0.2
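
The README does not spell out what `--giga-prob` does. The sketch below assumes it is the probability that a given training batch is drawn from GigaSpeech rather than LibriSpeech. The function names `pick_dataset` and `count_picks` are hypothetical and are not taken from train.py.

```python
import random


def pick_dataset(rng, giga_prob):
    """Return "giga" with probability `giga_prob`, otherwise "libri"."""
    return rng.choices(
        ("libri", "giga"), weights=(1.0 - giga_prob, giga_prob), k=1
    )[0]


def count_picks(num_batches, giga_prob, seed=42):
    """Simulate `num_batches` training steps and count the source of each batch."""
    rng = random.Random(seed)
    counts = {"libri": 0, "giga": 0}
    for _ in range(num_batches):
        counts[pick_dataset(rng, giga_prob)] += 1
    return counts


if __name__ == "__main__":
    counts = count_picks(num_batches=10000, giga_prob=0.2)
    # Under the assumption above, roughly one batch in five comes from GigaSpeech.
    print(counts)
```

Under this reading, `--giga-prob 0.2` keeps LibriSpeech as the dominant source while still mixing in GigaSpeech batches throughout training.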