added RESULTS.md

2024-03-11 10:59:32 +08:00
commit 60691efddf
parent 78b39d9a7d
3 changed files with 56 additions and 1 deletions
--- a/egs/aishell/ASR/README.md
+++ b/egs/aishell/ASR/README.md
@@ -19,7 +19,9 @@ The following table lists the differences among them.
 | `transducer_stateless_modified`    | Conformer | Embedding + Conv1d | with modified transducer from `optimized_transducer`                     |
 | `transducer_stateless_modified-2`  | Conformer | Embedding + Conv1d | with modified transducer from `optimized_transducer` + extra data      |
 | `pruned_transducer_stateless3`     | Conformer (reworked) | Embedding + Conv1d | pruned RNN-T + reworked model with random combiner + using aidatatang_20zh as extra data|
-| `pruned_transducer_stateless7`     | Zipformer | Embedding | pruned RNN-T + zipformer encoder + stateless decoder with context-size 1 |
+| `pruned_transducer_stateless7`     | Zipformer | Embedding | pruned RNN-T + zipformer encoder + stateless decoder with context-size set to 1 |
+| `zipformer`                           | Upgraded Zipformer | Embedding + Conv1d | The latest recipe with context-size set to 1 |
+

 The decoder in `transducer_stateless` is modified from the paper
 [Rnn-Transducer with Stateless Prediction Network](https://ieeexplore.ieee.org/document/9054419/).
--- a/egs/mdcc/ASR/README.md
+++ b/egs/mdcc/ASR/README.md
@@ -5,3 +5,15 @@ transcripts, collected from Cantonese audiobooks from Hong Kong. It comprises ph
 politics, education, culture, lifestyle and family domains, covering a wide range of topics. 

 Manuscript can be found at: https://arxiv.org/abs/2201.02419
+
+# Transducers
+
+
+
+|                                       | Encoder             | Decoder            | Comment                     |
+|---------------------------------------|---------------------|--------------------|-----------------------------|
+| `zipformer`                           | Upgraded Zipformer | Embedding + Conv1d | The latest recipe with context-size set to 1 |
+
+The decoder is adapted from the one described in the paper
+[RNN-Transducer with Stateless Prediction Network](https://ieeexplore.ieee.org/document/9054419/).
+We place an additional Conv1d layer right after the input embedding layer.
--- a/egs/mdcc/ASR/RESULTS.md
+++ b/egs/mdcc/ASR/RESULTS.md
@@ -0,0 +1,41 @@
+## Results
+
+#### Zipformer
+
+See <https://github.com/k2-fsa/icefall/pull/1537>
+
+[./zipformer](./zipformer)
+
+##### normal-scaled model, number of model parameters: 74470867, i.e., 74.47 M
+
+|                        | test | valid | comment                                 |
+|------------------------|------|-------|-----------------------------------------|
+| greedy search          | 7.45 | 7.51  | --epoch 45 --avg 35                     |
+| modified beam search   | 6.68 | 6.73  | --epoch 45 --avg 35                     |
+| fast beam search       | 7.22 | 7.28  | --epoch 45 --avg 35                     |
+
+The training command:
+
+```
+export CUDA_VISIBLE_DEVICES="0,1,2,3"
+
+./zipformer/train.py \
+  --world-size 4 \
+  --start-epoch 1 \
+  --num-epochs 50 \
+  --use-fp16 1 \
+  --exp-dir ./zipformer/exp \
+  --max-duration 1000
+```
+
+The decoding command:
+
+```
+./zipformer/decode.py \
+  --epoch 45 \
+  --avg 35 \
+  --exp-dir ./zipformer/exp \
+  --decoding-method greedy_search  # or modified_beam_search, fast_beam_search
+```
+
+The pretrained model is available at: <https://huggingface.co/zrjin/icefall-asr-mdcc-zipformer-2024-03-11/>
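
The decoding command in `RESULTS.md` runs one method at a time. The following is a minimal sketch for producing all three rows of the results table in one pass. It assumes the script is sourced from `egs/mdcc/ASR`, that checkpoints up to epoch 45 exist in `./zipformer/exp`, and that `fast_beam_search` is the `--decoding-method` value behind the "fast beam search" row. `DECODE_CMD` and `decode_all_methods` are hypothetical names introduced here for illustration and are not part of the recipe.

```shell
# Hypothetical wrapper around the decoding command from RESULTS.md.
# DECODE_CMD is an assumed override hook (e.g. set it to "echo" for a dry run).
DECODE_CMD="${DECODE_CMD:-./zipformer/decode.py}"

decode_all_methods() {
  # One decoding run per row of the results table.
  for method in greedy_search modified_beam_search fast_beam_search; do
    $DECODE_CMD \
      --epoch 45 \
      --avg 35 \
      --exp-dir ./zipformer/exp \
      --decoding-method "$method"
  done
}
```

Calling `decode_all_methods` runs the three decodes in sequence. Setting `DECODE_CMD=echo` first prints the commands instead of executing them, which is a cheap way to check the flags before starting a long run.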