Zengwei Yao b25c234c51
Add Zipformer-MMI (#746)
* Minor fix to conformer-mmi

* Minor fixes

* Fix decode.py

* add training files

* train with ctc warmup

* add pruned_transducer_stateless7_mmi

* add zipformer_mmi/mmi_decode.py, using HP as decoding graph

* add mmi_decode.py

* remove pruned_transducer_stateless7_mmi

* rename zipformer_mmi/train_with_ctc.py as zipformer_mmi/train.py

* remove unused method

* rename mmi_decode.py

* add export.py pretrained.py jit_pretrained.py ...

* add RESULTS.md

* add CI test

* add docs

* add README.md

Co-authored-by: pkufool <wkang.pku@gmail.com>
2022-12-11 21:30:39 +08:00

This recipe implements the Zipformer-MMI model.

See https://k2-fsa.github.io/icefall/recipes/librispeech/zipformer_mmi.html for a detailed tutorial.

It uses CTC loss for warm-up and then switches to MMI loss during training.
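
A minimal sketch of how such a warm-up schedule could be expressed (the names `warmup_loss`, `ctc_loss`, `mmi_loss`, `warmup_batches` and the warm-up length are illustrative assumptions, not the recipe's actual identifiers; see zipformer_mmi/train.py for the real logic):

```python
import torch


def warmup_loss(
    ctc_loss: torch.Tensor,
    mmi_loss: torch.Tensor,
    batch_idx_train: int,
    warmup_batches: int = 4000,  # assumed warm-up length, not the recipe's value
) -> torch.Tensor:
    """Return the CTC loss while warming up, then switch to the MMI loss.

    Illustrative sketch only; the recipe's training script implements
    its own schedule.
    """
    if batch_idx_train < warmup_batches:
        return ctc_loss
    return mmi_loss
```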

For decoding, it uses HP (H is ctc_topo, P is a token-level bi-gram LM) as the decoding graph; a sketch of how this graph can be built follows the list below. The supported decoding methods are:

  • 1best. Extract the best path from the decoding lattice as the decoding result.
  • nbest. Extract n paths from the decoding lattice; the path with the highest score is the decoding result.
  • nbest-rescoring-LG. Extract n paths from the decoding lattice, rescore them with a word-level 3-gram LM, and use the path with the highest score as the decoding result.
  • nbest-rescoring-3-gram. Extract n paths from the decoding lattice, rescore them with a token-level 3-gram LM, and use the path with the highest score as the decoding result.
  • nbest-rescoring-4-gram. Extract n paths from the decoding lattice, rescore them with a token-level 4-gram LM, and use the path with the highest score as the decoding result.
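
The sketch below shows one way the HP graph and the 1best method could be realized with k2 and icefall's decoding helpers. It assumes a vocabulary size `max_token_id`, a bi-gram FSA saved during data preparation (the path shown is an assumption), and log-probabilities plus supervision segments produced by the acoustic model; the actual implementation lives in zipformer_mmi/mmi_decode.py and may differ in details:

```python
import k2
import torch
from icefall.decode import get_lattice, one_best_decoding


def build_hp_and_decode_1best(
    max_token_id: int,
    p_fsa_path: str,                     # e.g. "data/lang_bpe_500/P.pt" (assumed path)
    nnet_output: torch.Tensor,           # (N, T, C) log-probs from the model
    supervision_segments: torch.Tensor,  # (num_utts, 3), as expected by k2
    device: torch.device,
) -> k2.Fsa:
    """Build the HP decoding graph and run 1best decoding (illustrative sketch)."""
    # H: CTC topology over the token vocabulary.
    H = k2.ctc_topo(max_token_id, device=device)

    # P: token-level bi-gram LM prepared as an FSA during data preparation.
    P = k2.Fsa.from_dict(torch.load(p_fsa_path)).to(device)

    # HP: compose H with P to obtain the decoding graph used by this recipe.
    HP = k2.compose(H, k2.arc_sort(P), treat_epsilons_specially=True)
    HP = k2.arc_sort(k2.connect(HP))

    # Intersect the posteriors with HP to obtain a decoding lattice.
    lattice = get_lattice(
        nnet_output=nnet_output,
        decoding_graph=HP,
        supervision_segments=supervision_segments,
        search_beam=20,
        output_beam=8,
        min_active_states=30,
        max_active_states=10000,
        subsampling_factor=4,
    )

    # 1best: extract the single best path from the lattice.
    return one_best_decoding(lattice=lattice, use_double_scores=True)
```

The nbest and nbest-rescoring-* methods start from the same lattice but extract n paths (controlled by nbest-scale) and either pick the highest-scoring path directly or rescore the paths with an external n-gram LM first.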

Experimental results for a model trained on train-clean-100 (epoch-30-avg-10), given as WER (%) on test-clean & test-other:

  • 1best: 6.43 & 17.44
  • nbest (nbest-scale=1.2): 6.43 & 17.45
  • nbest-rescoring-LG (nbest-scale=1.2): 5.87 & 16.35
  • nbest-rescoring-3-gram (nbest-scale=1.2): 6.19 & 16.57
  • nbest-rescoring-4-gram (nbest-scale=1.2): 5.87 & 16.07

Experimental results for a model trained on the full LibriSpeech dataset (epoch-30-avg-10), given as WER (%) on test-clean & test-other:

  • 1best: 2.54 & 5.65
  • nbest (nbest-scale=1.2): 2.54 & 5.66
  • nbest-rescoring-LG (nbest-scale=1.2): 2.49 & 5.42
  • nbest-rescoring-3-gram (nbest-scale=1.2): 2.52 & 5.62
  • nbest-rescoring-4-gram (nbest-scale=1.2): 2.50 & 5.51