From e36ea89112bb3d81602cb4df51bd68e6d06dc150 Mon Sep 17 00:00:00 2001
From: Yifan Yang <64255737+yfyeung@users.noreply.github.com>
Date: Wed, 1 Feb 2023 21:04:56 +0800
Subject: [PATCH] update result.md for pruned_transducer_stateless7_ctc_bs
(#865)
---
egs/librispeech/ASR/RESULTS.md | 105 ++++++++++++++++++++++++++++++---
1 file changed, 98 insertions(+), 7 deletions(-)
diff --git a/egs/librispeech/ASR/RESULTS.md b/egs/librispeech/ASR/RESULTS.md
index b30cf7c1f..a3e44f09c 100644
--- a/egs/librispeech/ASR/RESULTS.md
+++ b/egs/librispeech/ASR/RESULTS.md
@@ -93,13 +93,13 @@ results at:
Number of model parameters: 69136519, i.e., 69.14 M
-| | test-clean | test-other | comment |
-|--------------------------|------------|-------------|---------------------|
-| 1best | 2.54 | 5.65 | --epoch 30 --avg 10 |
-| nbest | 2.54 | 5.66 | --epoch 30 --avg 10 |
-| nbest-rescoring-LG | 2.49 | 5.42 | --epoch 30 --avg 10 |
-| nbest-rescoring-3-gram | 2.52 | 5.62 | --epoch 30 --avg 10 |
-| nbest-rescoring-4-gram | 2.5 | 5.51 | --epoch 30 --avg 10 |
+| | test-clean | test-other | comment |
+| ---------------------- | ---------- | ---------- | ------------------- |
+| 1best | 2.54 | 5.65 | --epoch 30 --avg 10 |
+| nbest | 2.54 | 5.66 | --epoch 30 --avg 10 |
+| nbest-rescoring-LG | 2.49 | 5.42 | --epoch 30 --avg 10 |
+| nbest-rescoring-3-gram | 2.52 | 5.62 | --epoch 30 --avg 10 |
+| nbest-rescoring-4-gram | 2.5 | 5.51 | --epoch 30 --avg 10 |
The training commands are:
```bash
@@ -134,6 +134,97 @@ for m in nbest nbest-rescoring-LG nbest-rescoring-3-gram nbest-rescoring-4-gram;
done
```
+### pruned_transducer_stateless7_ctc_bs (zipformer with transducer loss and ctc loss using blank skip)
+
+See https://github.com/k2-fsa/icefall/pull/730 for more details.
+
+[pruned_transducer_stateless7_ctc_bs](./pruned_transducer_stateless7_ctc_bs)
+
+The tensorboard log can be found at
+
+
+You can find a pretrained model, training logs, decoding logs, and decoding
+results at:
+
+
+Number of model parameters: 76804822, i.e., 76.80 M
+
+Test on 8-card V100 cluster, with 4-card busy and 4-card idle.
+
+#### greedy_search
+
+| model | test-clean | test-other | decoding time(s) | comment |
+| ------------------------------------------------------------ | ---------- | ---------- | ---------------- | ------------------- |
+| [pruned_transducer_stateless7_ctc_bs](./pruned_transducer_stateless7_ctc_bs) | 2.28 | 5.53 | 48.939 | --epoch 30 --avg 13 |
+| [pruned_transducer_stateless7_ctc](./pruned_transducer_stateless7_ctc) | 2.24 | 5.18 | 91.900 | --epoch 30 --avg 8 |
+
+- [pruned_transducer_stateless7_ctc_bs](./pruned_transducer_stateless7_ctc_bs) applies blank skip both on training and decoding, and [pruned_transducer_stateless7_ctc](./pruned_transducer_stateless7_ctc) doesn`t apply blank skip.
+- Applying blank skip both on training and decoding is **1.88 times** faster than the model that doesn't apply blank skip without obvious performance loss.
+
+#### modified_beam_search
+
+| model | test-clean | test-other | decoding time(s) | comment |
+| ------------------------------------------------------------ | ---------- | ---------- | ---------------- | ------------------- |
+| [pruned_transducer_stateless7_ctc_bs](./pruned_transducer_stateless7_ctc_bs) | 2.26 | 5.44 | 80.446 | --epoch 30 --avg 13 |
+| [pruned_transducer_stateless7_ctc](./pruned_transducer_stateless7_ctc) | 2.20 | 5.12 | 283.676 | --epoch 30 --avg 8 |
+
+- [pruned_transducer_stateless7_ctc_bs](./pruned_transducer_stateless7_ctc_bs) applies blank skip both on training and decoding, and [pruned_transducer_stateless7_ctc](./pruned_transducer_stateless7_ctc) doesn`t apply blank skip.
+- Applying blank skip both on training and decoding is **3.53 times** faster than the model that doesn't apply blank skip without obvious performance loss.
+
+The training commands for the model using blank skip ([pruned_transducer_stateless7_ctc_bs](./pruned_transducer_stateless7_ctc_bs)) are:
+
+```bash
+export CUDA_VISIBLE_DEVICES="0,1,2,3"
+
+./pruned_transducer_stateless7_ctc_bs/train.py \
+ --world-size 4 \
+ --num-epochs 30 \
+ --full-libri 1 \
+ --use-fp16 1 \
+ --max-duration 750 \
+ --exp-dir pruned_transducer_stateless7_ctc_bs/exp \
+ --feedforward-dims "1024,1024,2048,2048,1024" \
+ --ctc-loss-scale 0.2 \
+ --master-port 12535
+```
+
+The decoding commands for the transducer branch of the model using blank skip ([pruned_transducer_stateless7_ctc_bs](./pruned_transducer_stateless7_ctc_bs)) are:
+
+```bash
+for m in greedy_search modified_beam_search fast_beam_search; do
+ for epoch in 30; do
+ for avg in 15; do
+ ./pruned_transducer_stateless7_ctc_bs/ctc_guild_decode_bs.py \
+ --epoch $epoch \
+ --avg $avg \
+ --use-averaged-model 1 \
+ --exp-dir ./pruned_transducer_stateless7_ctc_bs/exp \
+ --feedforward-dims "1024,1024,2048,2048,1024" \
+ --max-duration 600 \
+ --decoding-method $m
+ done
+ done
+done
+```
+
+The decoding commands for the transducer branch of the model without blank skip ([pruned_transducer_stateless7_ctc](./pruned_transducer_stateless7_ctc)) are:
+
+```bash
+for m in greedy_search modified_beam_search fast_beam_search; do
+ for epoch in 30; do
+ for avg in 15; do
+ ./pruned_transducer_stateless7_ctc/decode.py \
+ --epoch $epoch \
+ --avg $avg \
+ --use-averaged-model 1 \
+ --exp-dir ./pruned_transducer_stateless7_ctc/exp \
+ --feedforward-dims "1024,1024,2048,2048,1024" \
+ --max-duration 600 \
+ --decoding-method $m
+ done
+ done
+done
+```
### pruned_transducer_stateless7_ctc (zipformer with transducer loss and ctc loss)