# Results
### IWSLT Tunisian training results (Stateless Pruned Transducer)
#### 2023-06-01
| Decoding method      | dev WER | test WER | comment              |
|----------------------|---------|----------|----------------------|
| modified beam search | 47.6    | 51.2     | --epoch 20, --avg 10 |

The training command for reproducing this result is given below:
```
export CUDA_VISIBLE_DEVICES="0,1,2,3"

./pruned_transducer_stateless5/train.py \
  --world-size 4 \
  --num-epochs 20 \
  --start-epoch 1 \
  --exp-dir pruned_transducer_stateless5/exp \
  --max-duration 300 \
  --num-buckets 50
```
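As a rough, self-contained illustration of what `--max-duration` and `--num-buckets` control (the actual batching is handled by lhotse inside the recipe's data module), the Python sketch below groups made-up utterances into duration buckets and fills each batch until the summed duration reaches the cap:

```python
import random

# Schematic sketch only: group fake utterances into duration buckets and cap
# each batch at a total duration, mirroring what --num-buckets and
# --max-duration control.  The durations and bucket count are invented.
MAX_DURATION = 300.0  # seconds per batch, as in the training command
NUM_BUCKETS = 5       # the command uses 50; 5 keeps the toy output readable

random.seed(0)
utterances = [(f"utt{i:03d}", random.uniform(1.0, 20.0)) for i in range(200)]

# Bucket utterances by duration so that batch mates have similar lengths
# (less padding when features are collated).
lo = min(d for _, d in utterances)
hi = max(d for _, d in utterances)
width = (hi - lo) / NUM_BUCKETS + 1e-9
buckets = [[] for _ in range(NUM_BUCKETS)]
for utt_id, dur in utterances:
    buckets[min(int((dur - lo) / width), NUM_BUCKETS - 1)].append((utt_id, dur))

# Fill batches bucket by bucket, never exceeding MAX_DURATION per batch.
batches = []
for bucket in buckets:
    batch, total = [], 0.0
    for utt_id, dur in bucket:
        if batch and total + dur > MAX_DURATION:
            batches.append(batch)
            batch, total = [], 0.0
        batch.append(utt_id)
        total += dur
    if batch:
        batches.append(batch)

print(f"{len(batches)} batches, e.g. first batch has {len(batches[0])} utterances")
```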
The tensorboard training log can be found at
https://tensorboard.dev/experiment/yBijWJSPSGuBqMwTZ509lA/

The decoding command is:
```
for method in modified_beam_search; do
  ./pruned_transducer_stateless5/decode.py \
    --epoch 15 \
    --beam-size 20 \
    --avg 5 \
    --exp-dir ./pruned_transducer_stateless5/exp \
    --max-duration 400 \
    --decoding-method modified_beam_search \
    --max-sym-per-frame 1 \
    --num-encoder-layers 12 \
    --dim-feedforward 1024 \
    --nhead 8 \
    --encoder-dim 256 \
    --decoder-dim 256 \
    --joiner-dim 256 \
    --use-averaged-model true
done
```
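`modified_beam_search` expands every hypothesis by at most one symbol per encoder frame, merges hypotheses that end up with the same token sequence, and keeps only the best `--beam-size` of them. The NumPy sketch below is a schematic, unbatched illustration of that idea with a toy joiner; it is not the recipe's (batched) implementation:

```python
import numpy as np

# Schematic, unbatched sketch of the modified beam search idea for a
# transducer: per encoder frame, every hypothesis is expanded by at most one
# symbol (blank keeps the sequence unchanged), hypotheses with identical
# token sequences are merged, and only the best BEAM hypotheses survive.
# The "joiner" below is a toy stand-in, not the recipe's model.
BLANK = 0
VOCAB = 10
BEAM = 4

def toy_joiner_logprobs(frame: np.ndarray, last_token: int) -> np.ndarray:
    """Fake joiner: combine an encoder frame with the last emitted token."""
    logits = frame + np.eye(VOCAB)[last_token]
    return logits - np.log(np.exp(logits).sum())  # log-softmax

def modified_beam_search(encoder_out: np.ndarray) -> list:
    hyps = {(): 0.0}  # token sequence -> total log probability
    for frame in encoder_out:
        new_hyps = {}
        for seq, logp in hyps.items():
            last = seq[-1] if seq else BLANK
            token_logp = toy_joiner_logprobs(frame, last)
            for tok in range(VOCAB):
                new_seq = seq if tok == BLANK else seq + (tok,)
                cand = logp + token_logp[tok]
                # Merge hypotheses that reach the same token sequence.
                new_hyps[new_seq] = (
                    np.logaddexp(new_hyps[new_seq], cand)
                    if new_seq in new_hyps else cand
                )
        # Prune to the beam size.
        hyps = dict(sorted(new_hyps.items(), key=lambda kv: -kv[1])[:BEAM])
    return list(max(hyps.items(), key=lambda kv: kv[1])[0])

rng = np.random.default_rng(0)
encoder_out = rng.standard_normal((20, VOCAB))  # 20 fake frames; dim == VOCAB only for the toy
print(modified_beam_search(encoder_out))
```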
### IWSLT Tunisian training results (Zipformer)
#### 2023-06-01
The tensorboard training log can be found at
https://tensorboard.dev/experiment/yLE399ZPTzePG8B39jRyOw/

| Decoding method      | dev WER | test WER | comment              |
|----------------------|---------|----------|----------------------|
| modified beam search | 47.6    | 51.2     | --epoch 20, --avg 10 |

To reproduce the above result, use the following commands for training:

Note: the model was trained on V100 32GB GPUs.
```
export CUDA_VISIBLE_DEVICES="0,1"

./zipformer/train.py \
  --world-size 2 \
  --num-epochs 20 \
  --start-epoch 1 \
  --use-fp16 1 \
  --exp-dir zipformer/exp \
  --causal 0 \
  --num-encoder-layers 2,2,2,2,2,2 \
  --feedforward-dim 512,768,1024,1536,1024,768 \
  --encoder-dim 192,256,384,512,384,256 \
  --encoder-unmasked-dim 192,192,256,256,256,192 \
  --max-duration 800 \
  --prune-range 10
```
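`--use-fp16 1` enables mixed-precision training. The sketch below shows the generic PyTorch mechanism behind it (autocast plus gradient scaling) on a toy model and loss; it is not the recipe's training loop:

```python
import torch
import torch.nn as nn

# Generic mixed-precision training skeleton (what --use-fp16 1 enables at a
# high level): forward passes run under autocast, and a GradScaler rescales
# the loss so fp16 gradients do not underflow.  Model, data and loss are toys.
use_fp16 = torch.cuda.is_available()  # fall back to fp32 on CPU
device = "cuda" if use_fp16 else "cpu"

model = nn.Linear(80, 512).to(device)                      # stand-in acoustic model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler(enabled=use_fp16)

features = torch.randn(8, 80, device=device)               # fake batch of fbank frames
targets = torch.randn(8, 512, device=device)

for step in range(3):
    optimizer.zero_grad()
    with torch.cuda.amp.autocast(enabled=use_fp16):
        loss = nn.functional.mse_loss(model(features), targets)
    scaler.scale(loss).backward()
    scaler.step(optimizer)   # unscales gradients, then runs the optimizer
    scaler.update()
    print(f"step {step}: loss {loss.item():.4f}")
```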
The decoding command is:
```
for method in modified_beam_search; do
  ./zipformer/decode.py \
    --epoch 20 \
    --beam-size 20 \
    --avg 13 \
    --exp-dir ./zipformer/exp \
    --max-duration 800 \
    --decoding-method $method \
    --num-encoder-layers 2,2,2,2,2,2 \
    --feedforward-dim 512,768,1024,1536,1024,768 \
    --encoder-dim 192,256,384,512,384,256 \
    --encoder-unmasked-dim 192,192,256,256,256,192 \
    --use-averaged-model true
done
```
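`--avg 13` with `--use-averaged-model true` decodes with parameters averaged over several checkpoints instead of a single epoch's weights. icefall's own averaging involves extra bookkeeping during training that is not reproduced here; the sketch below only illustrates the basic idea of element-wise parameter averaging, using tiny stand-in models rather than real `epoch-*.pt` files:

```python
import torch
import torch.nn as nn

# Basic idea behind checkpoint averaging: element-wise mean of the parameters
# of several saved models.  The three tiny Linear modules below stand in for
# real checkpoints; icefall's own utilities handle the actual bookkeeping.
def average_state_dicts(state_dicts):
    avg = {k: v.detach().clone().float() for k, v in state_dicts[0].items()}
    for sd in state_dicts[1:]:
        for k in avg:
            avg[k] += sd[k].float()
    for k in avg:
        avg[k] /= len(state_dicts)
    return avg

torch.manual_seed(0)
checkpoints = [nn.Linear(4, 2).state_dict() for _ in range(3)]  # fake epoch checkpoints

averaged = nn.Linear(4, 2)
averaged.load_state_dict(average_state_dicts(checkpoints))
print(averaged.weight)
```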