Update results

2022-04-11 21:46:40 +00:00
commit 22f011e5ab
parent f485b66d54
2 changed files with 35 additions and 8 deletions
--- a/egs/gigaspeech/ASR/README.md
+++ b/egs/gigaspeech/ASR/README.md
@@ -15,6 +15,6 @@ ln -sfv /path/to/GigaSpeech download/GigaSpeech
 ## Performance Record
 |     |  Dev  | Test  |
 |-----|-------|-------|
-| WER | 11.93 | 11.86 |
+| WER | 10.47 | 10.58 |

 See [RESULTS](/egs/gigaspeech/ASR/RESULTS.md) for details.
--- a/egs/gigaspeech/ASR/RESULTS.md
+++ b/egs/gigaspeech/ASR/RESULTS.md
@@ -5,22 +5,23 @@
 #### 2022-04-06

 The best WER, as of 2022-04-06, for the gigaspeech is below
-(using HLG decoding + n-gram LM rescoring + attention decoder rescoring):
+
+Results using HLG decoding + n-gram LM rescoring + attention decoder rescoring:

 |     |  Dev  | Test  |
 |-----|-------|-------|
-| WER | 11.93 | 11.86 |
+| WER | 10.47 | 10.58 |

 Scale values used in n-gram LM rescoring and attention rescoring for the best WERs are:
 | ngram_lm_scale | attention_scale |
 |----------------|-----------------|
-|      0.3       |        1.5      |
+|      0.5       |       1.3       |


 To reproduce the above result, use the following commands for training:

 ```
-cd egs/gigaspeech/ASR/conformer_ctc
+cd egs/gigaspeech/ASR
 ./prepare.sh
 export CUDA_VISIBLE_DEVICES="0,1,2,3,4,5,6,7"
 ./conformer_ctc/train.py \
@@ -31,12 +32,12 @@ export CUDA_VISIBLE_DEVICES="0,1,2,3,4,5,6,7"
  --lang-dir data/lang_bpe_500
 ```

-and the following command for decoding
+and the following command for decoding:

 ```
 ./conformer_ctc/decode.py \
-  --epoch 19 \
-  --avg 8 \
+  --epoch 18 \
+  --avg 6 \
  --method attention-decoder \
  --num-paths 1000 \
  --exp-dir conformer_ctc/exp_500 \
@@ -47,3 +48,29 @@ and the following command for decoding

 The tensorboard log for training is available at
 <https://tensorboard.dev/experiment/rz63cmJXSK2fV9GceJtZXQ/>
+
+Results using HLG decoding + whole lattice rescoring:
+
+|     |  Dev  | Test  |
+|-----|-------|-------|
+| WER | 10.51 | 10.62 |
+
+Scale values used in n-gram LM rescoring and attention rescoring for the best WERs are:
+| lm_scale |
+|----------|
+|   0.2    |
+
+To reproduce the above result, use the training commands above, and the following command for decoding:
+
+```
+./conformer_ctc/decode.py \
+  --epoch 18 \
+  --avg 6 \
+  --method whole-lattice-rescoring \
+  --num-paths 1000 \
+  --exp-dir conformer_ctc/exp_500 \
+  --lang-dir data/lang_bpe_500 \
+  --max-duration 20 \
+  --num-workers 1
+```
+Note: the `whole-lattice-rescoring` method is about twice as fast as the `attention-decoder` method, with slightly worse WER.
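
For context on the size of the change, the relative WER reduction implied by the old and new table rows can be worked out with a few lines of Python. This is only an illustrative sketch and not part of the commit; the helper name `relative_wer_reduction` is hypothetical, and the numbers are copied from the diff above.

```python
def relative_wer_reduction(old_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction in percent (positive means improvement)."""
    if old_wer <= 0:
        raise ValueError("old_wer must be positive")
    return (old_wer - new_wer) / old_wer * 100.0


# WERs taken from the diff above (attention-decoder rescoring).
old = {"Dev": 11.93, "Test": 11.86}
new = {"Dev": 10.47, "Test": 10.58}

for split in ("Dev", "Test"):
    reduction = relative_wer_reduction(old[split], new[split])
    print(f"{split}: {old[split]:.2f} -> {new[split]:.2f} "
          f"({reduction:.1f}% relative reduction)")
```

The same helper can be applied to the whole-lattice-rescoring numbers (10.51 / 10.62) to compare the two decoding methods against the previous result.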