
Introduction

Please refer to https://icefall.readthedocs.io/en/latest/recipes/librispeech/index.html for how to run models in this recipe.

./RESULTS.md contains the latest results.

Transducers

Several folders in this directory have transducer in their names. The following table lists the differences among them.

| Folder | Encoder | Decoder | Comment |
|---|---|---|---|
| transducer | Conformer | LSTM | |
| transducer_stateless | Conformer | Embedding + Conv1d | Using optimized_transducer for computing RNN-T loss |
| transducer_stateless2 | Conformer | Embedding + Conv1d | Using torchaudio for computing RNN-T loss |
| transducer_lstm | LSTM | LSTM | |
| transducer_stateless_multi_datasets | Conformer | Embedding + Conv1d | Using data from GigaSpeech as extra training data |
| pruned_transducer_stateless | Conformer | Embedding + Conv1d | Using k2 pruned RNN-T loss |
| pruned_transducer_stateless2 | Conformer (modified) | Embedding + Conv1d | Using k2 pruned RNN-T loss |
| pruned_transducer_stateless3 | Conformer (modified) | Embedding + Conv1d | Using k2 pruned RNN-T loss + using GigaSpeech as extra training data |

The decoder in transducer_stateless is modified from the paper RNN-Transducer with Stateless Prediction Network. We place an additional Conv1d layer right after the input embedding layer.
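
To illustrate the idea of an embedding layer followed by a Conv1d layer, here is a minimal pure-Python sketch. It is not the code used in this recipe: the class name `ToyStatelessDecoder`, its parameters (`context_size`, `blank_id`, `seed`), the random weights, the left-padding with the blank symbol, and the choice of a per-channel (depthwise) convolution are all assumptions made for illustration. What it shows is that the output at each position depends only on the last `context_size` tokens, with no recurrent state carried along.

```python
import random


class ToyStatelessDecoder:
    """Hypothetical toy decoder: embedding lookup, then a 1-D convolution
    over the last `context_size` token embeddings. Illustrative only."""

    def __init__(self, vocab_size, embed_dim, context_size, blank_id=0, seed=0):
        rng = random.Random(seed)
        self.embed_dim = embed_dim
        self.context_size = context_size
        self.blank_id = blank_id
        # Embedding table of shape (vocab_size, embed_dim).
        self.embedding = [
            [rng.uniform(-1.0, 1.0) for _ in range(embed_dim)]
            for _ in range(vocab_size)
        ]
        # Assumed depthwise kernel: one weight per (channel, context position).
        self.conv_weight = [
            [rng.uniform(-1.0, 1.0) for _ in range(context_size)]
            for _ in range(embed_dim)
        ]

    def forward(self, tokens):
        """Return one output vector (length embed_dim) per input token."""
        # Left-pad with blank so every position sees a full context window.
        padded = [self.blank_id] * (self.context_size - 1) + list(tokens)
        embedded = [self.embedding[t] for t in padded]
        outputs = []
        for i in range(len(tokens)):
            window = embedded[i : i + self.context_size]
            # Convolve each channel over the context window.
            outputs.append(
                [
                    sum(
                        self.conv_weight[c][k] * window[k][c]
                        for k in range(self.context_size)
                    )
                    for c in range(self.embed_dim)
                ]
            )
        return outputs


decoder = ToyStatelessDecoder(vocab_size=10, embed_dim=4, context_size=2)
# Two different histories that end in the same two tokens (3, 7).
out_a = decoder.forward([5, 3, 7])
out_b = decoder.forward([1, 9, 3, 7])
print(out_a[-1] == out_b[-1])
```

Because the two histories share their last two tokens, the final output vectors are identical, which is what "stateless" means here: an LSTM decoder, by contrast, would in general produce different outputs for the two histories.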