Mirror of <https://github.com/k2-fsa/icefall.git>, synced 2025-08-08 17:42:21 +00:00

Update Zipformer-large result on LibriSpeech (#1343)

* update zipformer-large result on librispeech

This commit is contained in: parent 770c495484, commit c0a53271e2

Changed file: README.md
@@ -118,11 +118,12 @@ We provide a Colab notebook to run a pre-trained transducer conformer + stateless

#### k2 pruned RNN-T

| Encoder         | Params | test-clean | test-other | epochs | devices    |
|-----------------|--------|------------|------------|--------|------------|
| zipformer       | 65.5M  | 2.21       | 4.79       | 50     | 4 32G-V100 |
| zipformer-small | 23.2M  | 2.42       | 5.73       | 50     | 2 32G-V100 |
| zipformer-large | 148.4M | 2.06       | 4.63       | 50     | 4 32G-V100 |
| zipformer-large | 148.4M | 2.00       | 4.38       | 174    | 8 80G-A100 |

Note: No auxiliary losses are used in the training and no LMs are used
in the decoding.
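The test-clean and test-other columns report word error rate (WER, %), the edit distance between reference and hypothesis transcripts normalized by reference length. As a reminder of what the numbers mean, here is a minimal sketch (the `wer` helper below is ours for illustration; icefall computes this internally during scoring):

```python
def wer(ref: list, hyp: list) -> float:
    """Word error rate (%): Levenshtein distance between the reference
    and hypothesis word lists, divided by the reference length."""
    m, n = len(ref), len(hyp)
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # deletions only
    for j in range(n + 1):
        d[0][j] = j  # insertions only
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution
            )
    return 100.0 * d[m][n] / max(m, 1)

# One substitution ("cat" -> "bat") over 3 reference words.
print(wer("the cat sat".split(), "the bat sat".split()))
```

So 2.00 on test-clean means roughly 2 word errors per 100 reference words.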
@@ -245,6 +245,58 @@ for m in greedy_search modified_beam_search fast_beam_search; do

done
```

##### large-scaled model, number of model parameters: 148439574, i.e., 148.4 M, trained on 8 80G-A100 GPUs

The tensorboard log can be found at
<https://tensorboard.dev/experiment/95TdNyEuQXaWK2PzFpD9yg/>

You can find a pretrained model, training logs, decoding logs, and decoding results at:
<https://huggingface.co/Zengwei/icefall-asr-librispeech-zipformer-large-2023-10-26-8-a100>

You can use <https://github.com/k2-fsa/sherpa> to deploy it.

| decoding method      | test-clean | test-other | comment               |
|----------------------|------------|------------|-----------------------|
| greedy_search        | 2.00       | 4.47       | --epoch 174 --avg 172 |
| modified_beam_search | 2.00       | 4.38       | --epoch 174 --avg 172 |
| fast_beam_search     | 2.00       | 4.42       | --epoch 174 --avg 172 |
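The `--avg 172` flag averages the parameters of the last 172 saved checkpoints before decoding, which typically improves WER over any single checkpoint. Conceptually it is an element-wise mean over parameter tensors; a simplified sketch with plain dicts (icefall's real implementation operates on PyTorch state dicts):

```python
def average_checkpoints(state_dicts):
    """Return the element-wise mean of several checkpoints' parameters.

    Each checkpoint is a dict mapping parameter name -> value; all
    checkpoints are assumed to share the same set of names.
    """
    n = len(state_dicts)
    avg = {}
    for name in state_dicts[0]:
        avg[name] = sum(sd[name] for sd in state_dicts) / n
    return avg

# Two toy "checkpoints" with scalar parameters.
ckpts = [{"w": 1.0, "b": 0.0}, {"w": 3.0, "b": 2.0}]
print(average_checkpoints(ckpts))  # {'w': 2.0, 'b': 1.0}
```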
The training command is:

```bash
export CUDA_VISIBLE_DEVICES="0,1,2,3,4,5,6,7"

./zipformer/train.py \
  --world-size 8 \
  --num-epochs 174 \
  --start-epoch 1 \
  --use-fp16 1 \
  --exp-dir zipformer/exp-large \
  --causal 0 \
  --num-encoder-layers 2,2,4,5,4,2 \
  --feedforward-dim 512,768,1536,2048,1536,768 \
  --encoder-dim 192,256,512,768,512,256 \
  --encoder-unmasked-dim 192,192,256,320,256,192 \
  --full-libri 1 \
  --max-duration 2200
```
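`--max-duration` caps the total audio duration (in seconds) per batch, rather than fixing a number of utterances, so batches of short utterances contain more of them. A minimal sketch of that idea (the `batch_by_duration` helper is ours for illustration; the actual sampling is done by lhotse's dynamic samplers):

```python
def batch_by_duration(durations, max_duration):
    """Greedily pack utterance indices into batches whose total
    duration stays within max_duration seconds."""
    batches, cur, total = [], [], 0.0
    for i, dur in enumerate(durations):
        # Start a new batch when adding this utterance would overflow.
        if cur and total + dur > max_duration:
            batches.append(cur)
            cur, total = [], 0.0
        cur.append(i)
        total += dur
    if cur:
        batches.append(cur)
    return batches

# Four utterances of 10s, 8s, 5s, 12s with a 20s cap per batch.
print(batch_by_duration([10.0, 8.0, 5.0, 12.0], max_duration=20.0))
# -> [[0, 1], [2, 3]]
```

A large cap like 2200 is feasible here because each 80G-A100 can hold far bigger batches than a 32G-V100.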
The decoding command is:

```bash
export CUDA_VISIBLE_DEVICES="0"

for m in greedy_search modified_beam_search fast_beam_search; do
  ./zipformer/decode.py \
    --epoch 174 \
    --avg 172 \
    --exp-dir zipformer/exp-large \
    --max-duration 600 \
    --causal 0 \
    --decoding-method $m \
    --num-encoder-layers 2,2,4,5,4,2 \
    --feedforward-dim 512,768,1536,2048,1536,768 \
    --encoder-dim 192,256,512,768,512,256 \
    --encoder-unmasked-dim 192,192,256,320,256,192
done
```

#### streaming

##### normal-scaled model, number of model parameters: 66110931, i.e., 66.11 M