diff --git a/README.md b/README.md
index b1ec8fc38..217c8e744 100644
--- a/README.md
+++ b/README.md
@@ -72,6 +72,8 @@ The best WER we currently have is:
 |-----|------------|------------|
 | WER | 3.16 | 7.71 |
 
+We provide a Colab notebook to run a pre-trained RNN-T conformer model: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1_u6yK9jDkPwG_NLrZMN2XK7Aeq4suMO2?usp=sharing)
+
 ### Aishell
diff --git a/egs/librispeech/ASR/RESULTS.md b/egs/librispeech/ASR/RESULTS.md
index 8d7c867c0..4e62af2c3 100644
--- a/egs/librispeech/ASR/RESULTS.md
+++ b/egs/librispeech/ASR/RESULTS.md
@@ -1,5 +1,51 @@
 ## Results
 
+### LibriSpeech BPE training results (RNN-T)
+
+#### 2021-12-17
+
+RNN-T + Conformer encoder
+
+The best WER is
+
+|     | test-clean | test-other |
+|-----|------------|------------|
+| WER | 3.16       | 7.71       |
+
+using `--epoch 26 --avg 12` during decoding.
+
+The training command to reproduce the above WER is:
+
+```
+export CUDA_VISIBLE_DEVICES="0,1,2,3"
+
+./transducer/train.py \
+  --world-size 4 \
+  --num-epochs 30 \
+  --start-epoch 0 \
+  --exp-dir transducer/exp-lr-2.5-full \
+  --full-libri 1 \
+  --max-duration 250 \
+  --lr-factor 2.5
+```
+
+The decoding command is:
+
+```
+epoch=26
+avg=12
+
+./transducer/decode.py \
+  --epoch $epoch \
+  --avg $avg \
+  --exp-dir transducer/exp-lr-2.5-full \
+  --bpe-model ./data/lang_bpe_500/bpe.model \
+  --max-duration 100
+```
+
+You can find the tensorboard log at:
+
 ### LibriSpeech BPE training results (Conformer-CTC)
 
 #### 2021-11-09
diff --git a/egs/librispeech/ASR/transducer/pretrained.py b/egs/librispeech/ASR/transducer/pretrained.py
index 9dedfc16f..bb59a7338 100755
--- a/egs/librispeech/ASR/transducer/pretrained.py
+++ b/egs/librispeech/ASR/transducer/pretrained.py
@@ -276,6 +276,8 @@ def main():
             hyp = beam_search(
                 model=model, encoder_out=encoder_out_i, beam=params.beam_size
             )
+        else:
+            raise ValueError(f"Unsupported method: {params.method}")
 
         hyps.append(sp.decode(hyp).split())
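
The `pretrained.py` hunk above adds a guard branch to the decoding-method dispatch: any `--method` value other than the supported ones now raises instead of silently reusing a stale hypothesis from a previous iteration. Below is a minimal, self-contained sketch of that pattern — the function name `decode_one` and the placeholder return values are hypothetical, not the actual icefall search implementations:

```python
def decode_one(method: str, encoder_out, beam_size: int = 4):
    """Dispatch on the decoding method; reject anything unrecognized.

    `encoder_out` and the returned token IDs are placeholders standing in
    for the real greedy_search/beam_search calls in transducer/pretrained.py.
    """
    if method == "greedy_search":
        hyp = [1, 2, 3]  # placeholder for greedy_search(model, encoder_out)
    elif method == "beam_search":
        hyp = [1, 2, 3, 4]  # placeholder for beam_search(..., beam=beam_size)
    else:
        # The behavior added by the patch: fail loudly on an unknown method
        # instead of falling through with an undefined/stale `hyp`.
        raise ValueError(f"Unsupported method: {method}")
    return hyp
```

Without the `else` branch, a typo in `--method` would leave `hyp` unbound (a `NameError` on the first utterance) or, worse, carry over the previous utterance's hypothesis inside a loop; the explicit `ValueError` surfaces the misconfiguration immediately.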