From ffb25d12dcfda20f68459eb3915939f94f9c9104 Mon Sep 17 00:00:00 2001
From: luomingshuang <739314837@qq.com>
Date: Wed, 27 Jul 2022 16:41:23 +0800
Subject: [PATCH] add README.md and RESULTS.md

---
 egs/wenetspeech/ASR/README.md  |  1 +
 egs/wenetspeech/ASR/RESULTS.md | 78 ++++++++++++++++++++++++++++++++--
 2 files changed, 76 insertions(+), 3 deletions(-)

diff --git a/egs/wenetspeech/ASR/README.md b/egs/wenetspeech/ASR/README.md
index c92f1b4e6..44e631b4a 100644
--- a/egs/wenetspeech/ASR/README.md
+++ b/egs/wenetspeech/ASR/README.md
@@ -13,6 +13,7 @@ The following table lists the differences among them.

| | Encoder | Decoder | Comment |
|---------------------------------------|---------------------|--------------------|-----------------------------|
| `pruned_transducer_stateless2` | Conformer(modified) | Embedding + Conv1d | Using k2 pruned RNN-T loss | |
+| `pruned_transducer_stateless5` | Conformer(modified) | Embedding + Conv1d | Using k2 pruned RNN-T loss | |

The decoder in `transducer_stateless` is modified from the paper [Rnn-Transducer with Stateless Prediction Network](https://ieeexplore.ieee.org/document/9054419/).

diff --git a/egs/wenetspeech/ASR/RESULTS.md b/egs/wenetspeech/ASR/RESULTS.md
index ea6658ddb..cc36ae4f2 100644
--- a/egs/wenetspeech/ASR/RESULTS.md
+++ b/egs/wenetspeech/ASR/RESULTS.md
@@ -1,12 +1,84 @@
## Results

### WenetSpeech char-based training results (offline and streaming) (Pruned Transducer 5)

#### 2022-07-22

Using the code from this PR https://github.com/k2-fsa/icefall/pull/447.

When training with the L subset, the CERs are

**Offline**:
|decoding-method| epoch | avg | use-averaged-model | DEV | TEST-NET | TEST-MEETING|
|-- | -- | -- | -- | -- | -- | --|
|greedy_search | 4 | 1 | True | 8.22 | 9.03 | 14.54|
|modified_beam_search | 4 | 1 | True | **8.17** | **9.04** | **14.44**|
|fast_beam_search | 4 | 1 | True | 8.29 | 9.00 | 14.93|

The command for reproducing the offline training is given below:
```
export CUDA_VISIBLE_DEVICES="0,1,2,3,4,5,6,7"

./pruned_transducer_stateless5/train.py \
  --lang-dir data/lang_char \
  --exp-dir pruned_transducer_stateless5/exp_L_offline \
  --world-size 8 \
  --num-epochs 15 \
  --start-epoch 2 \
  --max-duration 120 \
  --valid-interval 3000 \
  --model-warm-step 3000 \
  --save-every-n 8000 \
  --average-period 1000 \
  --training-subset L
```

The tensorboard training log can be found at https://tensorboard.dev/experiment/SvnN2jfyTB2Hjqu22Z7ZoQ/#scalars .
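The patch reports three decoding methods in the offline table but does not include the decoding command itself. A minimal sketch is given below; it assumes the usual icefall `decode.py` interface (`--epoch`, `--avg`, `--use-averaged-model`, `--exp-dir`, `--lang-dir`, `--max-duration`, `--decoding-method`) carries over unchanged to `pruned_transducer_stateless5`, so check the flags against the script before relying on it.

```
# Hedged sketch, not part of this patch: flag names follow common icefall
# decode.py conventions, and the values mirror the offline table above
# (epoch 4, avg 1, use-averaged-model True).
export CUDA_VISIBLE_DEVICES="0"

for method in greedy_search modified_beam_search fast_beam_search; do
  ./pruned_transducer_stateless5/decode.py \
    --epoch 4 \
    --avg 1 \
    --use-averaged-model True \
    --exp-dir pruned_transducer_stateless5/exp_L_offline \
    --lang-dir data/lang_char \
    --max-duration 100 \
    --decoding-method $method
done
```

Each iteration of the loop corresponds to one row of the offline table above.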
A pre-trained offline model and decoding logs can be found at

**Streaming**:
|decoding-method| epoch | avg | use-averaged-model | DEV | TEST-NET | TEST-MEETING|
|--|--|--|--|--|--|--|
| greedy_search | 7 | 1 | True | 8.78 | 10.12 | 16.16 |
| modified_beam_search | 7 | 1 | True | **8.53** | **9.95** | **15.81** |
| fast_beam_search | 7 | 1 | True | 9.01 | 10.47 | 16.28 |

The command for reproducing the streaming training is given below:
```
export CUDA_VISIBLE_DEVICES="0,1,2,3,4,5,6,7"

./pruned_transducer_stateless5/train.py \
  --lang-dir data/lang_char \
  --exp-dir pruned_transducer_stateless5/exp_L_streaming \
  --world-size 8 \
  --num-epochs 15 \
  --start-epoch 1 \
  --max-duration 140 \
  --valid-interval 3000 \
  --model-warm-step 3000 \
  --save-every-n 8000 \
  --average-period 1000 \
  --training-subset L \
  --dynamic-chunk-training True \
  --causal-convolution True \
  --short-chunk-size 25 \
  --num-left-chunks 4
```

The tensorboard training log can be found at https://tensorboard.dev/experiment/E2NXPVflSOKWepzJ1a1uDQ/#scalars .

A pre-trained streaming model and decoding logs can be found at

### WenetSpeech char-based training results (Pruned Transducer 2)

#### 2022-05-19

Using the code from this PR https://github.com/k2-fsa/icefall/pull/349.

-When training with the L subset, the WERs are
+When training with the L subset, the CERs are

| | dev | test-net | test-meeting | comment |
|------------------------------------|-------|----------|--------------|------------------------------------------|

@@ -72,7 +144,7 @@ avg=2
  --max-states 8
```

-When training with the M subset, the WERs are
+When training with the M subset, the CERs are

| | dev | test-net | test-meeting | comment |
|------------------------------------|--------|-----------|---------------|-------------------------------------------|
| greedy search | 10.40 | 11.31 | 19.64 | --epoch 29, --avg 11, --max-duration 600 |
| modified beam search (beam size 4) | 9.85 | 11.04 | 18.20 | --epoch 29, --avg 11, --max-duration 100 |
| fast beam search (set as default) | 10.18 | 11.10 | 19.32 | --epoch 29, --avg 11, --max-duration 1500 |

-When training with the S subset, the WERs are
+When training with the S subset, the CERs are

| | dev | test-net | test-meeting | comment |
|------------------------------------|--------|-----------|---------------|-------------------------------------------|
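Returning to the streaming `pruned_transducer_stateless5` model introduced above: the patch likewise gives only the streaming training command, not a decoding command. A hypothetical simulated-streaming decoding sketch follows; the chunk-related flags (`--simulate-streaming`, `--causal-convolution`, `--decode-chunk-size`, `--left-context`) are assumptions borrowed from other icefall streaming recipes and are not confirmed by this patch, so treat it purely as a starting point.

```
# Hedged sketch, not part of this patch: the streaming flags below are assumed
# from other icefall streaming recipes; epoch/avg values mirror the streaming
# table above (epoch 7, avg 1, use-averaged-model True).
export CUDA_VISIBLE_DEVICES="0"

./pruned_transducer_stateless5/decode.py \
  --epoch 7 \
  --avg 1 \
  --use-averaged-model True \
  --exp-dir pruned_transducer_stateless5/exp_L_streaming \
  --lang-dir data/lang_char \
  --max-duration 100 \
  --decoding-method modified_beam_search \
  --simulate-streaming True \
  --causal-convolution True \
  --decode-chunk-size 16 \
  --left-context 64
```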