mirror of
https://github.com/k2-fsa/icefall.git
synced 2025-09-08 08:34:19 +00:00
minor updates
This commit is contained in:
parent
4eb03bf519
commit
6eb141f0c5
@ -13,9 +13,9 @@ It's reworked Zipformer with Pruned RNNT loss.
|
|||||||
|
|
||||||
| | test | dev | comment |
|
| | test | dev | comment |
|
||||||
|------------------------|------|------|-----------------------------------------|
|
|------------------------|------|------|-----------------------------------------|
|
||||||
| greedy search | 4.73 | 4.54 | --epoch 38 --avg 14 |
|
| greedy search | 4.67 | 4.37 | --epoch 55 --avg 17 |
|
||||||
| modified beam search | 4.49 | 4.27 | --epoch 40 --avg 12 |
|
| modified beam search | 4.40 | 4.13 | --epoch 55 --avg 17 |
|
||||||
| fast beam search | 4.65 | 4.4 | --epoch 40 --avg 12 |
|
| fast beam search | 4.60 | 4.31 | --epoch 55 --avg 17 |
|
||||||
|
|
||||||
Command for training is:
|
Command for training is:
|
||||||
```bash
|
```bash
|
||||||
@ -25,118 +25,77 @@ export CUDA_VISIBLE_DEVICES="0,1"
|
|||||||
|
|
||||||
./zipformer/train.py \
|
./zipformer/train.py \
|
||||||
--world-size 2 \
|
--world-size 2 \
|
||||||
--num-epochs 40 \
|
--num-epochs 60 \
|
||||||
--start-epoch 1 \
|
--start-epoch 1 \
|
||||||
--use-fp16 1 \
|
--use-fp16 1 \
|
||||||
--context-size 1 \
|
--context-size 1 \
|
||||||
--enable-musan 0 \
|
--enable-musan 0 \
|
||||||
--exp-dir zipformer/exp \
|
--exp-dir zipformer/exp \
|
||||||
--max-duration 1000
|
|
||||||
```
|
|
||||||
|
|
||||||
Command for decoding is:
|
|
||||||
```bash
|
|
||||||
./zipformer/decode.py \
|
|
||||||
--epoch 38 \
|
|
||||||
--avg 14 \
|
|
||||||
--exp-dir ./zipformer/exp \
|
|
||||||
--lang-dir data/lang_char \
|
|
||||||
--context-size 1 \
|
|
||||||
--decoding-method greedy_search
|
|
||||||
|
|
||||||
for m in modified_beam_search fast_beam_search ; do
|
|
||||||
./zipformer/decode.py \
|
|
||||||
--epoch 40 \
|
|
||||||
--avg 12 \
|
|
||||||
--exp-dir ./zipformer/exp \
|
|
||||||
--lang-dir data/lang_char \
|
|
||||||
--context-size 1 \
|
|
||||||
--decoding-method $m
|
|
||||||
done
|
|
||||||
```
|
|
||||||
|
|
||||||
Note that results below are produced by model trained on data without speed perturbation applied.
|
|
||||||
|
|
||||||
**⚠️ If you prefer to have the speed perturbation disabled, please pass `false` to `--perturb-speed` of the `prepare.sh` script as demonstrated below.**
|
|
||||||
|
|
||||||
##### normal-scaled model, number of model parameters: 73412551, i.e., 73.41 M
|
|
||||||
|
|
||||||
| | test | dev | comment |
|
|
||||||
|------------------------|------|------|-----------------------------------------|
|
|
||||||
| greedy search | 4.92 | 4.61 | --epoch 90 --avg 40 --max-duration 1200 |
|
|
||||||
| modified beam search | 4.65 | 4.34 | --epoch 90 --avg 40 --max-duration 1200 |
|
|
||||||
| fast beam search | 4.83 | 4.52 | --epoch 90 --avg 40 --max-duration 1200 |
|
|
||||||
|
|
||||||
Command for training is:
|
|
||||||
```bash
|
|
||||||
./prepare.sh --perturb-speed false
|
|
||||||
|
|
||||||
export CUDA_VISIBLE_DEVICES="0,1"
|
|
||||||
|
|
||||||
./zipformer/train.py \
|
|
||||||
--world-size 2 \
|
|
||||||
--num-epochs 150 \
|
|
||||||
--start-epoch 1 \
|
|
||||||
--use-fp16 1 \
|
|
||||||
--context-size 1 \
|
|
||||||
--exp-dir zipformer/exp \
|
|
||||||
--max-duration 1000 \
|
--max-duration 1000 \
|
||||||
--lr-epochs 18
|
--enable-musan 0 \
|
||||||
|
--base-lr 0.045 \
|
||||||
|
--lr-batches 7500 \
|
||||||
|
--lr-epochs 18 \
|
||||||
|
--spec-aug-time-warp-factor 20
|
||||||
```
|
```
|
||||||
|
|
||||||
Command for decoding is:
|
Command for decoding is:
|
||||||
```bash
|
```bash
|
||||||
for m in greedy_search modified_beam_search fast_beam_search ; do
|
for m in greedy_search modified_beam_search fast_beam_search ; do
|
||||||
./zipformer/decode.py \
|
./zipformer/decode.py \
|
||||||
--epoch 90 \
|
--epoch 55 \
|
||||||
--avg 40 \
|
--avg 17 \
|
||||||
--exp-dir ./zipformer/exp \
|
--exp-dir ./zipformer/exp \
|
||||||
--lang-dir data/lang_char \
|
--lang-dir data/lang_char \
|
||||||
--context-size 1 \
|
--context-size 1 \
|
||||||
--max-duration 1200 \
|
|
||||||
--decoding-method $m
|
--decoding-method $m
|
||||||
done
|
done
|
||||||
```
|
```
|
||||||
|
Pretrained models, training logs, decoding logs, tensorboard and decoding results
|
||||||
|
are available at
|
||||||
|
<https://huggingface.co/zrjin/icefall-asr-aishell-zipformer-2023-10-24>
|
||||||
|
|
||||||
|
|
||||||
##### small-scaled model, number of model parameters: 30167139, i.e., 30.17 M
|
##### small-scaled model, number of model parameters: 30167139, i.e., 30.17 M
|
||||||
|
|
||||||
| | test | dev | comment |
|
| | test | dev | comment |
|
||||||
|------------------------|------|------|-----------------------------------------|
|
|------------------------|------|------|-----------------------------------------|
|
||||||
| greedy search | 5.15 | 4.93 | --epoch 90 --avg 40 --max-duration 1200 |
|
| greedy search | 4.97 | 4.67 | --epoch 55 --avg 21 |
|
||||||
| modified beam search | 4.90 | 4.68 | --epoch 90 --avg 40 --max-duration 1200 |
|
| modified beam search | 4.67 | 4.40 | --epoch 55 --avg 21 |
|
||||||
| fast beam search | 5.08 | 4.85 | --epoch 90 --avg 40 --max-duration 1200 |
|
| fast beam search | 4.85 | 4.61 | --epoch 55 --avg 21 |
|
||||||
|
|
||||||
Command for training is:
|
Command for training is:
|
||||||
```bash
|
```bash
|
||||||
./prepare.sh --perturb-speed false
|
|
||||||
|
|
||||||
export CUDA_VISIBLE_DEVICES="0,1"
|
export CUDA_VISIBLE_DEVICES="0,1"
|
||||||
|
|
||||||
./zipformer/train.py \
|
./zipformer/train.py \
|
||||||
--world-size 2 \
|
--world-size 2 \
|
||||||
--num-epochs 100 \
|
--num-epochs 60 \
|
||||||
--start-epoch 1 \
|
--start-epoch 1 \
|
||||||
--use-fp16 1 \
|
--use-fp16 1 \
|
||||||
--context-size 1 \
|
--context-size 1 \
|
||||||
--exp-dir zipformer/exp-small \
|
--exp-dir zipformer/exp-small \
|
||||||
--max-duration 1200 \
|
--enable-musan 0 \
|
||||||
|
--base-lr 0.045 \
|
||||||
|
--lr-batches 7500 \
|
||||||
--lr-epochs 18 \
|
--lr-epochs 18 \
|
||||||
|
--spec-aug-time-warp-factor 20 \
|
||||||
--num-encoder-layers 2,2,2,2,2,2 \
|
--num-encoder-layers 2,2,2,2,2,2 \
|
||||||
--feedforward-dim 512,768,768,768,768,768 \
|
--feedforward-dim 512,768,768,768,768,768 \
|
||||||
--encoder-dim 192,256,256,256,256,256 \
|
--encoder-dim 192,256,256,256,256,256 \
|
||||||
--encoder-unmasked-dim 192,192,192,192,192,192
|
--encoder-unmasked-dim 192,192,192,192,192,192 \
|
||||||
|
--max-duration 1200
|
||||||
```
|
```
|
||||||
|
|
||||||
Command for decoding is:
|
Command for decoding is:
|
||||||
```bash
|
```bash
|
||||||
for m in greedy_search modified_beam_search fast_beam_search ; do
|
for m in greedy_search modified_beam_search fast_beam_search ; do
|
||||||
./zipformer/decode.py \
|
./zipformer/decode.py \
|
||||||
--epoch 90 \
|
--epoch 55 \
|
||||||
--avg 40 \
|
--avg 21 \
|
||||||
--exp-dir ./zipformer/exp-small \
|
--exp-dir ./zipformer/exp-small \
|
||||||
--lang-dir data/lang_char \
|
--lang-dir data/lang_char \
|
||||||
--context-size 1 \
|
--context-size 1 \
|
||||||
--max-duration 1200 \
|
|
||||||
--decoding-method $m \
|
--decoding-method $m \
|
||||||
--num-encoder-layers 2,2,2,2,2,2 \
|
--num-encoder-layers 2,2,2,2,2,2 \
|
||||||
--feedforward-dim 512,768,768,768,768,768 \
|
--feedforward-dim 512,768,768,768,768,768 \
|
||||||
@ -145,6 +104,60 @@ for m in greedy_search modified_beam_search fast_beam_search ; do
|
|||||||
done
|
done
|
||||||
```
|
```
|
||||||
|
|
||||||
|
Pretrained models, training logs, decoding logs, tensorboard and decoding results
|
||||||
|
are available at
|
||||||
|
<https://huggingface.co/zrjin/icefall-asr-aishell-zipformer-small-2023-10-24/>
|
||||||
|
|
||||||
|
##### large-scaled model, number of model parameters: 157285130, i.e., 157.29 M
|
||||||
|
|
||||||
|
| | test | dev | comment |
|
||||||
|
|------------------------|------|------|-----------------------------------------|
|
||||||
|
| greedy search | 4.49 | 4.22 | --epoch 56 --avg 23 |
|
||||||
|
| modified beam search | 4.28 | 4.03 | --epoch 56 --avg 23 |
|
||||||
|
| fast beam search | 4.44 | 4.18 | --epoch 56 --avg 23 |
|
||||||
|
|
||||||
|
Command for training is:
|
||||||
|
```bash
|
||||||
|
export CUDA_VISIBLE_DEVICES="0,1"
|
||||||
|
|
||||||
|
./zipformer/train.py \
|
||||||
|
--world-size 2 \
|
||||||
|
--num-epochs 60 \
|
||||||
|
--use-fp16 1 \
|
||||||
|
--context-size 1 \
|
||||||
|
--exp-dir ./zipformer/exp-large \
|
||||||
|
--enable-musan 0 \
|
||||||
|
--lr-batches 7500 \
|
||||||
|
--lr-epochs 18 \
|
||||||
|
--spec-aug-time-warp-factor 20 \
|
||||||
|
--num-encoder-layers 2,2,4,5,4,2 \
|
||||||
|
--feedforward-dim 512,768,1536,2048,1536,768 \
|
||||||
|
--encoder-dim 192,256,512,768,512,256 \
|
||||||
|
--encoder-unmasked-dim 192,192,256,320,256,192 \
|
||||||
|
--max-duration 800
|
||||||
|
```
|
||||||
|
|
||||||
|
Command for decoding is:
|
||||||
|
```bash
|
||||||
|
for m in greedy_search modified_beam_search fast_beam_search ; do
|
||||||
|
./zipformer/decode.py \
|
||||||
|
--epoch 56 \
|
||||||
|
--avg 23 \
|
||||||
|
--exp-dir ./zipformer/exp-small \
|
||||||
|
--lang-dir data/lang_char \
|
||||||
|
--context-size 1 \
|
||||||
|
--decoding-method $m \
|
||||||
|
--num-encoder-layers 2,2,4,5,4,2 \
|
||||||
|
--feedforward-dim 512,768,1536,2048,1536,768 \
|
||||||
|
--encoder-dim 192,256,512,768,512,256 \
|
||||||
|
--encoder-unmasked-dim 192,192,256,320,256,192
|
||||||
|
done
|
||||||
|
```
|
||||||
|
|
||||||
|
Pretrained models, training logs, decoding logs, tensorboard and decoding results
|
||||||
|
are available at
|
||||||
|
<https://huggingface.co/zrjin/icefall-asr-aishell-zipformer-large-2023-10-24/>
|
||||||
|
|
||||||
#### Pruned transducer stateless 7 streaming
|
#### Pruned transducer stateless 7 streaming
|
||||||
[./pruned_transducer_stateless7_streaming](./pruned_transducer_stateless7_streaming)
|
[./pruned_transducer_stateless7_streaming](./pruned_transducer_stateless7_streaming)
|
||||||
|
|
||||||
|
@ -2,49 +2,6 @@
|
|||||||
|
|
||||||
### Aishell2 char-based training results
|
### Aishell2 char-based training results
|
||||||
|
|
||||||
#### Zipformer
|
|
||||||
|
|
||||||
[./zipformer](./zipformer)
|
|
||||||
|
|
||||||
It's reworked Zipformer with Pruned RNNT loss, note that results below are produced by model trained on data without speed perturbation applied.
|
|
||||||
|
|
||||||
**⚠️ If you prefer to have the speed perturbation disabled, please pass `false` to `--perturb-speed` of the `prepare.sh` script as demonstrated below.**
|
|
||||||
|
|
||||||
| | dev-ios | test-ios | comment |
|
|
||||||
|---------------------------------------|---------|----------|----------------------------------|
|
|
||||||
| greedy search | 5.58 | 5.94 | --epoch 25, --avg 5, --max-duration 200 |
|
|
||||||
| modified beam search (set as default) | 5.45 | 5.86 | --epoch 25, --avg 5, --max-duration 200 |
|
|
||||||
| fast beam search (set as default) | 5.52 | 5.91 | --epoch 25, --avg 5, --max-duration 200 |
|
|
||||||
| fast beam search oracle | 1.65 | 1.71 | --epoch 25, --avg 5, --max-duration 200 |
|
|
||||||
| fast beam search nbest LG | 6.14 | 6.72 | --epoch 25, --avg 5, --max-duration 200 |
|
|
||||||
|
|
||||||
The training command for reproducing is given below:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
./prepare.sh --perturb-speed false
|
|
||||||
|
|
||||||
export CUDA_VISIBLE_DEVICES="0,1"
|
|
||||||
|
|
||||||
./zipformer/train.py \
|
|
||||||
--world-size 2 \
|
|
||||||
--lang-dir data/lang_char \
|
|
||||||
--num-epochs 25 \
|
|
||||||
--start-epoch 1 \
|
|
||||||
--max-duration 1000 \
|
|
||||||
--use-fp16 1
|
|
||||||
```
|
|
||||||
|
|
||||||
The decoding command is:
|
|
||||||
```bash
|
|
||||||
for method in greedy_search modified_beam_search fast_beam_search fast_beam_search_nbest_oracle fast_beam_search_LG; do
|
|
||||||
./pruned_transducer_stateless5/decode.py \
|
|
||||||
--epoch 25 \
|
|
||||||
--avg 5 \
|
|
||||||
--exp-dir ./zipformer/exp \
|
|
||||||
--decoding-method $method \
|
|
||||||
done
|
|
||||||
```
|
|
||||||
|
|
||||||
#### Pruned transducer stateless 5
|
#### Pruned transducer stateless 5
|
||||||
|
|
||||||
Using the codes from this commit https://github.com/k2-fsa/icefall/pull/465.
|
Using the codes from this commit https://github.com/k2-fsa/icefall/pull/465.
|
||||||
|
@ -2,50 +2,6 @@
|
|||||||
|
|
||||||
### Aishell4 Char training results (Pruned Transducer Stateless5)
|
### Aishell4 Char training results (Pruned Transducer Stateless5)
|
||||||
|
|
||||||
#### 2023-08-14
|
|
||||||
|
|
||||||
#### Zipformer
|
|
||||||
|
|
||||||
[./zipformer](./zipformer)
|
|
||||||
|
|
||||||
It's reworked Zipformer with Pruned RNNT loss, note that results below are produced by model trained on data without speed perturbation applied.
|
|
||||||
|
|
||||||
**⚠️ If you prefer to have the speed perturbation disabled, please pass `false` to `--perturb-speed` of the `prepare.sh` script as demonstrated below.**
|
|
||||||
|
|
||||||
| | test | comment |
|
|
||||||
|------------------------|------|---------------------------------------|
|
|
||||||
| greedy search | 40.77 | --epoch 45 --avg 6 --max-duration 200 |
|
|
||||||
| modified beam search | 40.39 | --epoch 45 --avg 6 --max-duration 200 |
|
|
||||||
| fast beam search | 46.51 | --epoch 45 --avg 6 --max-duration 200 |
|
|
||||||
|
|
||||||
Command for training is:
|
|
||||||
```bash
|
|
||||||
./prepare.sh --perturb-speed false
|
|
||||||
|
|
||||||
export CUDA_VISIBLE_DEVICES="0,1"
|
|
||||||
|
|
||||||
./zipformer/train.py \
|
|
||||||
--world-size 2 \
|
|
||||||
--num-epochs 45 \
|
|
||||||
--start-epoch 1 \
|
|
||||||
--use-fp16 1 \
|
|
||||||
--exp-dir zipformer/exp \
|
|
||||||
--max-duration 1000
|
|
||||||
```
|
|
||||||
|
|
||||||
Command for decoding is:
|
|
||||||
```bash
|
|
||||||
for m in greedy_search modified_beam_search fast_beam_search ; do
|
|
||||||
./zipformer/decode.py \
|
|
||||||
--epoch 45 \
|
|
||||||
--avg 6 \
|
|
||||||
--exp-dir ./zipformer/exp \
|
|
||||||
--lang-dir data/lang_char \
|
|
||||||
--decoding-method $m
|
|
||||||
done
|
|
||||||
```
|
|
||||||
|
|
||||||
|
|
||||||
#### 2022-06-13
|
#### 2022-06-13
|
||||||
|
|
||||||
Using the codes from this PR https://github.com/k2-fsa/icefall/pull/399.
|
Using the codes from this PR https://github.com/k2-fsa/icefall/pull/399.
|
||||||
|
Loading…
x
Reference in New Issue
Block a user