Update results.

2025-12-11 06:55:27 +00:00 · 2021-12-27 15:31:35 +08:00 · 2021-12-27 15:31:35 +08:00 · 8eea3414a4
commit 8eea3414a4
parent b5735ae16f
4 changed files with 23 additions and 27 deletions
--- a/.github/workflows/run-pretrained-transducer-stateless.yml
+++ b/.github/workflows/run-pretrained-transducer-stateless.yml
@ -74,11 +74,11 @@ jobs:
          mkdir tmp
          cd tmp
          git lfs install
-          git clone https://huggingface.co/csukuangfj/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-22
+          git clone https://huggingface.co/csukuangfj/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-27
          cd ..
          tree tmp
-          soxi tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-22/test_wavs/*.wav
-          ls -lh tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-22/test_wavs/*.wav
+          soxi tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-27/test_wavs/*.wav
+          ls -lh tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-27/test_wavs/*.wav

      - name: Run greedy search decoding
        shell: bash
@ -87,11 +87,11 @@ jobs:
          cd egs/librispeech/ASR
          ./transducer_stateless/pretrained.py \
            --method greedy_search \
-            --checkpoint ./tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-22/exp/pretrained.pt \
-            --bpe-model ./tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-22/data/lang_bpe_500/bpe.model \
-            ./tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-22/test_wavs/1089-134686-0001.wav \
-            ./tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-22/test_wavs/1221-135766-0001.wav \
-            ./tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-22/test_wavs/1221-135766-0002.wav
+            --checkpoint ./tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-27/exp/pretrained.pt \
+            --bpe-model ./tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-27/data/lang_bpe_500/bpe.model \
+            ./tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-27/test_wavs/1089-134686-0001.wav \
+            ./tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-27/test_wavs/1221-135766-0001.wav \
+            ./tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-27/test_wavs/1221-135766-0002.wav

      - name: Run beam search decoding
        shell: bash
@ -101,8 +101,8 @@ jobs:
          ./transducer_stateless/pretrained.py \
            --method beam_search \
            --beam-size 4 \
-            --checkpoint ./tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-22/exp/pretrained.pt \
-            --bpe-model ./tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-22/data/lang_bpe_500/bpe.model \
-            ./tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-22/test_wavs/1089-134686-0001.wav \
-            ./tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-22/test_wavs/1221-135766-0001.wav \
-            ./tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-22/test_wavs/1221-135766-0002.wav
+            --checkpoint ./tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-27/exp/pretrained.pt \
+            --bpe-model ./tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-27/data/lang_bpe_500/bpe.model \
+            ./tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-27/test_wavs/1089-134686-0001.wav \
+            ./tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-27/test_wavs/1221-135766-0001.wav \
+            ./tmp/icefall-asr-librispeech-transducer-stateless-bpe-500-2021-12-27/test_wavs/1221-135766-0002.wav
--- a/README.md
+++ b/README.md
@ -84,7 +84,7 @@ The best WER using beam search with beam size 4 is:

 |     | test-clean | test-other |
 |-----|------------|------------|
-| WER | 2.92       | 7.37       |
+| WER | 2.83       | 7.19       |

 Note: No auxiliary losses are used in the training and no LMs are used
 in the decoding.
--- a/egs/librispeech/ASR/RESULTS.md
+++ b/egs/librispeech/ASR/RESULTS.md
@ -4,7 +4,7 @@

 #### Conformer encoder + embedding decoder

-Using commit `fb6a57e9e01dd8aae2af2a6b4568daad8bc8ab32`.
+Using commit `TODO`.

 Conformer encoder + non-current decoder. The decoder
 contains only an embedding layer and a Conv1d (with kernel size 2).
@ -13,12 +13,8 @@ The WERs are

 |                           | test-clean | test-other | comment                                  |
 |---------------------------|------------|------------|------------------------------------------|
-| greedy search             | 2.99       | 7.52       | --epoch 20, --avg 10, --max-duration 100 |
-| beam search (beam size 2) | 2.95       | 7.43       |                                          |
-| beam search (beam size 3) | 2.94       | 7.37       |                                          |
-| beam search (beam size 4) | 2.92       | 7.37       |                                          |
-| beam search (beam size 5) | 2.93       | 7.38       |                                          |
-| beam search (beam size 8) | 2.92       | 7.38       |                                          |
+| greedy search             | 2.85       | 7.30       | --epoch 29, --avg 13, --max-duration 100 |
+| beam search (beam size 4) | 2.83       | 7.19       |                                          |

 The training command for reproducing is given below:

@ -36,12 +32,12 @@ export CUDA_VISIBLE_DEVICES="0,1,2,3"
 ```

 The tensorboard training log can be found at
-<https://tensorboard.dev/experiment/PsJ3LgkEQfOmzedAlYfVeg/#scalars&_smoothingWeight=0>
+<https://tensorboard.dev/experiment/Mjx7MeTgR3Oyr1yBCwjozw/>

 The decoding command is:
 ```
-epoch=20
-avg=10
+epoch=29
+avg=13

 ## greedy search
 ./transducer_stateless/decode.py \
@ -64,7 +60,7 @@ avg=10


 #### Conformer encoder + LSTM decoder
-Using commit `TODO`.
+Using commit `8187d6236c2926500da5ee854f758e621df803cc`.

 Conformer encoder + LSTM decoder.

--- a/egs/librispeech/ASR/transducer_stateless/decode.py
+++ b/egs/librispeech/ASR/transducer_stateless/decode.py
@ -70,14 +70,14 @@ def get_parser():
    parser.add_argument(
        "--epoch",
        type=int,
-        default=20,
+        default=29,
        help="It specifies the checkpoint to use for decoding."
        "Note: Epoch counts from 0.",
    )
    parser.add_argument(
        "--avg",
        type=int,
-        default=10,
+        default=13,
        help="Number of checkpoints to average. Automatically select "
        "consecutive checkpoints before the checkpoint specified by "
        "'--epoch'. ",