[Ready to merge]stateless6: states4 + hubert distillation. (#387 )

* a copy of stateless4 as base

* distillation with hubert

* fix typo

* example usage

* usage

* Update egs/librispeech/ASR/pruned_transducer_stateless6/hubert_xlarge.py

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

* fix comment

* add results of 100hours

* Update egs/librispeech/ASR/pruned_transducer_stateless6/hubert_xlarge.py

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

* Update egs/librispeech/ASR/pruned_transducer_stateless6/hubert_xlarge.py

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

* check fairseq and quantization

* a short intro to distillation framework

* Update egs/librispeech/ASR/pruned_transducer_stateless6/hubert_xlarge.py

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

* add intro of statless6 in README

* fix type error of dst_manifest_dir

* Update egs/librispeech/ASR/pruned_transducer_stateless6/hubert_xlarge.py

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

* make export.py call stateless6/train.py instead of stateless2/train.py

* update results by stateless6

* adjust results format

* fix typo

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

2022-05-28 12:37:50 +08:00

2.4 KiB

Raw Blame History

Introduction

Please refer to https://icefall.readthedocs.io/en/latest/recipes/librispeech/index.html for how to run models in this recipe.

./RESULTS.md contains the latest results.

Transducers

There are various folders containing the name transducer in this folder. The following table lists the differences among them.

	Encoder	Decoder	Comment
`transducer`	Conformer	LSTM
`transducer_stateless`	Conformer	Embedding + Conv1d	Using optimized_transducer from computing RNN-T loss
`transducer_stateless2`	Conformer	Embedding + Conv1d	Using torchaudio for computing RNN-T loss
`transducer_lstm`	LSTM	LSTM
`transducer_stateless_multi_datasets`	Conformer	Embedding + Conv1d	Using data from GigaSpeech as extra training data
`pruned_transducer_stateless`	Conformer	Embedding + Conv1d	Using k2 pruned RNN-T loss
`pruned_transducer_stateless2`	Conformer(modified)	Embedding + Conv1d	Using k2 pruned RNN-T loss
`pruned_transducer_stateless3`	Conformer(modified)	Embedding + Conv1d	Using k2 pruned RNN-T loss + using GigaSpeech as extra training data
`pruned_transducer_stateless4`	Conformer(modified)	Embedding + Conv1d	same as pruned_transducer_stateless2 + save averaged models periodically during training
`pruned_transducer_stateless5`	Conformer(modified)	Embedding + Conv1d	same as pruned_transducer_stateless4 + more layers + random combiner
`pruned_transducer_stateless6`	Conformer(modified)	Embedding + Conv1d	same as pruned_transducer_stateless4 + distillation with hubert

The decoder in transducer_stateless is modified from the paper Rnn-Transducer with Stateless Prediction Network. We place an additional Conv1d layer right after the input embedding layer.

2.4 KiB Raw Blame History

Introduction

Transducers

2.4 KiB

Raw Blame History