mirror of https://github.com/k2-fsa/icefall.git (synced 2025-09-19 05:54:20 +00:00)
update README.md and RESULTS.md
parent 40d2bda318
commit 11ea660c86
@@ -47,6 +47,7 @@ We place an additional Conv1d layer right after the input embedding layer.
| `conformer-ctc` | Conformer | Use auxiliary attention head |
| `conformer-ctc2` | Reworked Conformer | Use auxiliary attention head |
| `conformer-ctc3` | Reworked Conformer | Streaming version + delay penalty |
| `zipformer` | Upgraded Zipformer | Use auxiliary transducer head | The latest recipe |

# MMI

@@ -1,5 +1,69 @@
## Results

### zipformer (zipformer + pruned stateless transducer + CTC)

See <https://github.com/k2-fsa/icefall/pull/1111> for more details.

[zipformer](./zipformer)

#### Non-streaming

##### normal-scaled model, number of model parameters: 65805511, i.e., 65.81 M

The tensorboard log can be found at
<https://tensorboard.dev/experiment/Lo3Qlad7TP68ulM2K0ixgQ/>

You can find a pretrained model, training logs, decoding logs, and decoding results at:
<https://huggingface.co/Zengwei/icefall-asr-librispeech-zipformer-transducer-ctc-2023-06-13>

You can use <https://github.com/k2-fsa/sherpa> to deploy it.

Results of the CTC head:

| decoding method         | test-clean | test-other | comment             |
|-------------------------|------------|------------|---------------------|
| ctc-decoding            | 2.40       | 5.66       | --epoch 40 --avg 16 |
| 1best                   | 2.46       | 5.11       | --epoch 40 --avg 16 |
| nbest                   | 2.46       | 5.11       | --epoch 40 --avg 16 |
| nbest-rescoring         | 2.37       | 4.93       | --epoch 40 --avg 16 |
| whole-lattice-rescoring | 2.37       | 4.88       | --epoch 40 --avg 16 |

The training command is:
```bash
export CUDA_VISIBLE_DEVICES="0,1,2,3"
./zipformer/train.py \
  --world-size 4 \
  --num-epochs 40 \
  --start-epoch 1 \
  --use-fp16 1 \
  --exp-dir zipformer/exp-ctc-rnnt \
  --causal 0 \
  --use-transducer 1 \
  --use-ctc 1 \
  --ctc-loss-scale 0.2 \
  --full-libri 1 \
  --max-duration 1000
```
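
As a rough reading of the training command above: if `--max-duration` caps the seconds of audio in each process's batch (an assumption about how the sampler is configured, not something this file states), then the four processes together see at most 4 × 1000 seconds of audio per optimizer step. The helper name `audio_seconds_per_step` is hypothetical and only does the arithmetic:

```bash
# Hypothetical helper, not part of icefall: upper bound on seconds of audio
# per optimizer step, ASSUMING each of the --world-size processes draws its
# own batch of at most --max-duration seconds.
audio_seconds_per_step() {
  local world_size="$1"    # value passed to --world-size
  local max_duration="$2"  # value passed to --max-duration, in seconds
  echo $(( world_size * max_duration ))
}

audio_seconds_per_step 4 1000   # prints 4000
```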

The decoding command is:
```bash
export CUDA_VISIBLE_DEVICES="0"
for m in ctc-decoding 1best nbest nbest-rescoring whole-lattice-rescoring; do
  ./zipformer/ctc_decode.py \
    --epoch 40 \
    --avg 16 \
    --exp-dir zipformer/exp-ctc-rnnt \
    --use-transducer 1 \
    --use-ctc 1 \
    --max-duration 300 \
    --causal 0 \
    --num-paths 100 \
    --nbest-scale 1.0 \
    --hlg-scale 0.6 \
    --decoding-method $m
done
```
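
To put the CTC-head results table in perspective, the relative WER reduction between two decoding methods can be computed directly from its entries. This is a minimal sketch: the function name `rel_wer_reduction` is hypothetical, and the inputs are the test-other WERs of `ctc-decoding` (5.66) and `whole-lattice-rescoring` (4.88) taken from the table.

```bash
# Hypothetical helper, not part of icefall: relative WER reduction in percent,
# computed as (baseline - improved) / baseline * 100.
rel_wer_reduction() {
  awk -v base="$1" -v new="$2" \
    'BEGIN { printf "%.2f\n", (base - new) / base * 100 }'
}

# test-other: ctc-decoding (5.66) vs. whole-lattice-rescoring (4.88)
rel_wer_reduction 5.66 4.88   # prints 13.78
```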

### zipformer (zipformer + pruned stateless transducer)

See <https://github.com/k2-fsa/icefall/pull/1058> for more details.
@@ -285,7 +349,7 @@ export CUDA_VISIBLE_DEVICES="0,1"
  --lr-epochs 100 \
  --lr-batches 100000 \
  --bpe-model icefall-asr-librispeech-pruned-transducer-stateless7-2022-11-11/data/lang_bpe_500/bpe.model \
  --do-finetune True \
  --use-mux True \
  --finetune-ckpt icefall-asr-librispeech-pruned-transducer-stateless7-2022-11-11/exp/pretrain.pt \
  --max-duration 500