Update README.md (#1043)

* Update README.md
2025-12-11 06:55:27 +00:00 · 2023-05-08 16:59:05 +08:00 · 24b50a5bad
commit 24b50a5bad
parent efbb577b88
1 changed files with 92 additions and 59 deletions
--- a/README.md
+++ b/README.md
@@ -28,14 +28,15 @@ We provide the following recipes:
  - [yesno][yesno]
  - [LibriSpeech][librispeech]
+ - [GigaSpeech][gigaspeech]
  - [Aishell][aishell]
+ - [Aishell2][aishell2]
+ - [Aishell4][aishell4]
  - [TIMIT][timit]
  - [TED-LIUM3][tedlium3]
- - [GigaSpeech][gigaspeech]
  - [Aidatatang_200zh][aidatatang_200zh]
  - [WenetSpeech][wenetspeech]
  - [Alimeeting][alimeeting]
- - [Aishell4][aishell4]
  - [TAL_CSASR][tal_csasr]
 ### yesno
@@ -46,9 +47,7 @@ Training takes less than 30 seconds and gives you the following WER:
 ```
 [test_set] %WER 0.42% [1 / 240, 0 ins, 1 del, 0 sub ]
 ```
-We do provide a Colab notebook for this recipe.
+We provide a Colab notebook for this recipe: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1tIjjzaJc3IvGyKiMCDWO-TSnBgkcuN3B?usp=sharing)
-[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1tIjjzaJc3IvGyKiMCDWO-TSnBgkcuN3B?usp=sharing)
 ### LibriSpeech
@@ -118,19 +117,54 @@ We provide a Colab notebook to run a pre-trained transducer conformer + stateles
 |     | test-clean | test-other |
 |-----|------------|------------|
-| WER | 2.57       | 5.95       |
+| WER | 2.15       | 5.20       |
 Note: No auxiliary losses are used in the training and no LMs are used
 in the decoding.
 #### k2 pruned RNN-T + GigaSpeech
 |     | test-clean | test-other |
 |-----|------------|------------|
-| WER | 2.00       | 4.63       |
+| WER | 1.78       | 4.08       |
 Note: No auxiliary losses are used in the training and no LMs are used
 in the decoding.
+#### k2 pruned RNN-T + GigaSpeech + CommonVoice
+|     | test-clean | test-other |
+|-----|------------|------------|
+| WER | 1.90       | 3.98       |
+Note: No auxiliary losses are used in the training and no LMs are used
+in the decoding.
+### GigaSpeech
+We provide two models for this recipe: [Conformer CTC model][GigaSpeech_conformer_ctc]
+and [Pruned stateless RNN-T: Conformer encoder + Embedding decoder + k2 pruned RNN-T loss][GigaSpeech_pruned_transducer_stateless2].
+#### Conformer CTC
+|     |  Dev  | Test  |
+|-----|-------|-------|
+| WER | 10.47 | 10.58 |
+#### Pruned stateless RNN-T: Conformer encoder + Embedding decoder + k2 pruned RNN-T loss
+|                      |  Dev  | Test  |
+|----------------------|-------|-------|
+|    greedy search     | 10.51 | 10.73 |
+|   fast beam search   | 10.50 | 10.69 |
+| modified beam search | 10.40 | 10.51 |
 ### Aishell
-We provide two models for this recipe: [conformer CTC model][Aishell_conformer_ctc]
+We provide three models for this recipe: [conformer CTC model][Aishell_conformer_ctc],
-and [TDNN LSTM CTC model][Aishell_tdnn_lstm_ctc].
+[TDNN LSTM CTC model][Aishell_tdnn_lstm_ctc], and [Transducer Stateless Model][Aishell_pruned_transducer_stateless7].
 #### Conformer CTC Model
@@ -140,20 +174,6 @@ The best CER we currently have is:
 |-----|------|
 | CER | 4.26 |
 We provide a Colab notebook to run a pre-trained conformer CTC model: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1WnG17io5HEZ0Gn_cnh_VzK5QYOoiiklC?usp=sharing)
-#### Transducer Stateless Model
-The best CER we currently have is:
-|     | test |
-|-----|------|
-| CER | 4.68 |
-We provide a Colab notebook to run a pre-trained TransducerStateless model: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/14XaT2MhnBkK-3_RqqWq3K90Xlbin-GZC?usp=sharing)
 #### TDNN LSTM CTC Model
 The CER for this model is:
@@ -164,6 +184,46 @@ The CER for this model is:
 We provide a Colab notebook to run a pre-trained TDNN LSTM CTC model:  [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1jbyzYq3ytm6j2nlEt-diQm-6QVWyDDEa?usp=sharing)
+#### Transducer Stateless Model
+The best CER we currently have is:
+|     | test |
+|-----|------|
+| CER | 4.38 |
+We provide a Colab notebook to run a pre-trained TransducerStateless model: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/14XaT2MhnBkK-3_RqqWq3K90Xlbin-GZC?usp=sharing)
+### Aishell2
+We provide one model for this recipe: [Transducer Stateless Model][Aishell2_pruned_transducer_stateless5].
+#### Transducer Stateless Model
+The best WER we currently have is:
+|     |   dev-ios  |  test-ios  |
+|-----|------------|------------|
+| WER |    5.32    |    5.56    |
+### Aishell4
+We provide one model for this recipe: [Pruned stateless RNN-T: Conformer encoder + Embedding decoder + k2 pruned RNN-T loss][Aishell4_pruned_transducer_stateless5].
+#### Pruned stateless RNN-T: Conformer encoder + Embedding decoder + k2 pruned RNN-T loss (trained with all subsets)
+The best CER we currently have is:
+|     |   test     |
+|-----|------------|
+| CER |   29.08    |
+We provide a Colab notebook to run a pre-trained Pruned Transducer Stateless model: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1z3lkURVv9M7uTiIgf3Np9IntMHEknaks?usp=sharing)
 ### TIMIT
 We provide two models for this recipe: [TDNN LSTM CTC model][TIMIT_tdnn_lstm_ctc]
@@ -187,7 +247,8 @@ The PER for this model is:
 |--|--|
 |PER| 17.66% |
-We provide a Colab notebook to run a pre-trained TDNN LiGRU CTC model:  [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/11IT-k4HQIgQngXz1uvWsEYktjqQt7Tmb?usp=sharing)
+We provide a Colab notebook to run a pre-trained TDNN LiGRU CTC model:  [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1z3lkURVv9M7uTiIgf3Np9IntMHEknaks?usp=sharing)
 ### TED-LIUM3
@@ -215,24 +276,6 @@ The best WER using modified beam search with beam size 4 is:
 We provide a Colab notebook to run a pre-trained Pruned Transducer Stateless model: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1je_1zGrOkGVVd4WLzgkXRHxl-I27yWtz?usp=sharing)
-### GigaSpeech
-We provide two models for this recipe: [Conformer CTC model][GigaSpeech_conformer_ctc]
-and [Pruned stateless RNN-T: Conformer encoder + Embedding decoder + k2 pruned RNN-T loss][GigaSpeech_pruned_transducer_stateless2].
-#### Conformer CTC
-|     |  Dev  | Test  |
-|-----|-------|-------|
-| WER | 10.47 | 10.58 |
-#### Pruned stateless RNN-T: Conformer encoder + Embedding decoder + k2 pruned RNN-T loss
-|                      |  Dev  | Test  |
-|----------------------|-------|-------|
-|    greedy search     | 10.51 | 10.73 |
-|   fast beam search   | 10.50 | 10.69 |
-| modified beam search | 10.40 | 10.51 |
 ### Aidatatang_200zh
@@ -248,6 +291,7 @@ We provide one model for this recipe: [Pruned stateless RNN-T: Conformer encoder
 We provide a Colab notebook to run a pre-trained Pruned Transducer Stateless model: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1wNSnSj3T5oOctbh5IGCa393gKOoQw2GH?usp=sharing)
 ### WenetSpeech
 We provide some models for this recipe: [Pruned stateless RNN-T_2: Conformer encoder + Embedding decoder + k2 pruned RNN-T loss][WenetSpeech_pruned_transducer_stateless2] and [Pruned stateless RNN-T_5: Conformer encoder + Embedding decoder + k2 pruned RNN-T loss][WenetSpeech_pruned_transducer_stateless5].
@@ -284,20 +328,6 @@ We provide one model for this recipe: [Pruned stateless RNN-T: Conformer encoder
 We provide a Colab notebook to run a pre-trained Pruned Transducer Stateless model: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1tKr3f0mL17uO_ljdHGKtR7HOmthYHwJG?usp=sharing)
-### Aishell4
-We provide one model for this recipe: [Pruned stateless RNN-T: Conformer encoder + Embedding decoder + k2 pruned RNN-T loss][Aishell4_pruned_transducer_stateless5].
-#### Pruned stateless RNN-T: Conformer encoder + Embedding decoder + k2 pruned RNN-T loss (trained with all subsets)
-The best CER(%) results:
-|                      |  test  |
-|----------------------|--------|
-|    greedy search     | 29.89  |
-|   fast beam search   | 28.91  |
-| modified beam search | 29.08  |
-We provide a Colab notebook to run a pre-trained Pruned Transducer Stateless model: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1z3lkURVv9M7uTiIgf3Np9IntMHEknaks?usp=sharing)
 ### TAL_CSASR
@@ -333,6 +363,9 @@ Please see: [![Open In Colab](https://colab.research.google.com/assets/colab-bad
 [LibriSpeech_transducer_stateless]: egs/librispeech/ASR/transducer_stateless
 [Aishell_tdnn_lstm_ctc]: egs/aishell/ASR/tdnn_lstm_ctc
 [Aishell_conformer_ctc]: egs/aishell/ASR/conformer_ctc
+[Aishell_pruned_transducer_stateless7]: egs/aishell/ASR/pruned_transducer_stateless7_bbpe
+[Aishell2_pruned_transducer_stateless5]: egs/aishell2/ASR/pruned_transducer_stateless5
+[Aishell4_pruned_transducer_stateless5]: egs/aishell4/ASR/pruned_transducer_stateless5
 [TIMIT_tdnn_lstm_ctc]: egs/timit/ASR/tdnn_lstm_ctc
 [TIMIT_tdnn_ligru_ctc]: egs/timit/ASR/tdnn_ligru_ctc
 [TED-LIUM3_transducer_stateless]: egs/tedlium3/ASR/transducer_stateless
@@ -343,17 +376,17 @@ Please see: [![Open In Colab](https://colab.research.google.com/assets/colab-bad
 [WenetSpeech_pruned_transducer_stateless2]: egs/wenetspeech/ASR/pruned_transducer_stateless2
 [WenetSpeech_pruned_transducer_stateless5]: egs/wenetspeech/ASR/pruned_transducer_stateless5
 [Alimeeting_pruned_transducer_stateless2]: egs/alimeeting/ASR/pruned_transducer_stateless2
-[Aishell4_pruned_transducer_stateless5]: egs/aishell4/ASR/pruned_transducer_stateless5
 [TAL_CSASR_pruned_transducer_stateless5]: egs/tal_csasr/ASR/pruned_transducer_stateless5
 [yesno]: egs/yesno/ASR
 [librispeech]: egs/librispeech/ASR
 [aishell]: egs/aishell/ASR
+[aishell2]: egs/aishell2/ASR
+[aishell4]: egs/aishell4/ASR
 [timit]: egs/timit/ASR
 [tedlium3]: egs/tedlium3/ASR
 [gigaspeech]: egs/gigaspeech/ASR
 [aidatatang_200zh]: egs/aidatatang_200zh/ASR
 [wenetspeech]: egs/wenetspeech/ASR
 [alimeeting]: egs/alimeeting/ASR
-[aishell4]: egs/aishell4/ASR
 [tal_csasr]: egs/tal_csasr/ASR
 [k2]: https://github.com/k2-fsa/k2
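
Reviewer note: the yesno result quoted in the README, `%WER 0.42% [1 / 240, 0 ins, 1 del, 0 sub ]`, can be checked by hand. The sketch below assumes the bracketed fields mean total errors over reference words, followed by insertion, deletion, and substitution counts. `wer_percent` is a hypothetical helper written for this note and is not part of the repository.

```python
def wer_percent(num_ref_words: int, ins: int, dels: int, subs: int) -> float:
    """Word error rate in percent: (ins + del + sub) / reference words * 100."""
    if num_ref_words <= 0:
        raise ValueError("num_ref_words must be positive")
    return 100.0 * (ins + dels + subs) / num_ref_words


# Counts taken from the yesno line in the README: 240 reference words,
# 0 insertions, 1 deletion, 0 substitutions.
yesno_wer = wer_percent(240, ins=0, dels=1, subs=0)
print(f"%WER {yesno_wer:.2f}%")  # prints "%WER 0.42%"
```

The same formula applies to the CER and PER tables if the unit is read as characters or phones instead of words.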