History

* keep model_avg on cpu

* explicitly convert model_avg to cpu

* minor fix

* remove device conversion for model_avg

* modify usage of the model device in train.py

* change model.device to next(model.parameters()).device for decoding

* assert params.start_epoch>0

* assert params.start_epoch>0, params.start_epoch

2022-05-07 10:42:34 +08:00

| Name | Last commit | Date |
|------|-------------|------|
| conformer_ctc | Fix potential bugs in PyTorch that exist in label_smoothing. (#300) | 2022-04-08 13:41:33 +08:00 |
| conformer_mmi | Change for asr_datamodule.py (#241) | 2022-03-14 00:30:58 +08:00 |
| local | Validate generated manifest files. (#338) | 2022-05-03 07:02:54 +08:00 |
| pruned_transducer_stateless | Modified conformer with multi datasets (#312) | 2022-04-29 15:40:30 +08:00 |
| pruned_transducer_stateless2 | Keep model_avg on cpu (#348) | 2022-05-07 10:42:34 +08:00 |
| pruned_transducer_stateless3 | Fix decoding for gigaspeech in the libri + giga setup. (#345) | 2022-05-05 20:58:46 +08:00 |
| pruned_transducer_stateless4 | Keep model_avg on cpu (#348) | 2022-05-07 10:42:34 +08:00 |
| streaming_conformer_ctc | Reset seed at the beginning of each epoch. (#221) | 2022-02-21 15:16:39 +08:00 |
| tdnn_lstm_ctc | Some cleanups | 2022-04-04 13:37:10 +08:00 |
| transducer | Fix some typos. (#329) | 2022-04-22 15:54:59 +08:00 |
| transducer_lstm | Fix some typos. (#329) | 2022-04-22 15:54:59 +08:00 |
| transducer_stateless | Support computing RNN-T loss with torchaudio (#316) | 2022-04-19 18:47:13 +08:00 |
| transducer_stateless2 | Support computing RNN-T loss with torchaudio (#316) | 2022-04-19 18:47:13 +08:00 |
| transducer_stateless_multi_datasets | Don't use a lambda for dataloader's worker_init_fn. (#284) | 2022-03-31 20:32:00 +08:00 |
| .gitignore | Modified conformer with multi datasets (#312) | 2022-04-29 15:40:30 +08:00 |
| prepare_giga_speech.sh | Modified conformer with multi datasets (#312) | 2022-04-29 15:40:30 +08:00 |
| prepare.sh | Validate generated manifest files. (#338) | 2022-05-03 07:02:54 +08:00 |
| README.md | Update results. (#340) | 2022-04-29 15:49:45 +08:00 |
| RESULTS-100hours.md | Update result for full libri + GigaSpeech using transducer_stateless. (#231) | 2022-03-01 17:01:46 +08:00 |
| RESULTS.md | Update results. (#340) | 2022-04-29 15:49:45 +08:00 |
| shared | Refactoring (#4) | 2021-08-04 14:53:02 +08:00 |

README.md

Introduction

Please refer to https://icefall.readthedocs.io/en/latest/recipes/librispeech/index.html for how to run models in this recipe.

./RESULTS.md contains the latest results.

Transducers

Several folders in this directory have `transducer` in their names. The following table lists the differences among them.

| Folder | Encoder | Decoder | Comment |
|--------|---------|---------|---------|
| `transducer` | Conformer | LSTM | |
| `transducer_stateless` | Conformer | Embedding + Conv1d | Using optimized_transducer for computing RNN-T loss |
| `transducer_stateless2` | Conformer | Embedding + Conv1d | Using torchaudio for computing RNN-T loss |
| `transducer_lstm` | LSTM | LSTM | |
| `transducer_stateless_multi_datasets` | Conformer | Embedding + Conv1d | Using data from GigaSpeech as extra training data |
| `pruned_transducer_stateless` | Conformer | Embedding + Conv1d | Using k2 pruned RNN-T loss |
| `pruned_transducer_stateless2` | Conformer (modified) | Embedding + Conv1d | Using k2 pruned RNN-T loss |
| `pruned_transducer_stateless3` | Conformer (modified) | Embedding + Conv1d | Using k2 pruned RNN-T loss + using GigaSpeech as extra training data |

The decoder in `transducer_stateless` is modified from the paper "Rnn-Transducer with Stateless Prediction Network". We place an additional Conv1d layer right after the input embedding layer.
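
To make the Embedding + Conv1d idea concrete, here is a minimal PyTorch sketch. It is not the code used in this recipe: the class name `StatelessDecoderSketch`, its parameters (`vocab_size`, `embedding_dim`, `context_size`), and the left-padding scheme are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class StatelessDecoderSketch(nn.Module):
    """Hypothetical stateless decoder: an embedding followed by a Conv1d.

    All names and defaults here are illustrative assumptions, not the
    recipe's actual implementation.
    """

    def __init__(self, vocab_size: int, embedding_dim: int, context_size: int = 2):
        super().__init__()
        self.context_size = context_size
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        # The extra Conv1d layer placed right after the embedding layer.
        self.conv = nn.Conv1d(
            in_channels=embedding_dim,
            out_channels=embedding_dim,
            kernel_size=context_size,
            bias=False,
        )

    def forward(self, y: torch.Tensor) -> torch.Tensor:
        # y: (N, U) tensor of token IDs.
        emb = self.embedding(y)      # (N, U, D)
        emb = emb.permute(0, 2, 1)   # (N, D, U), the layout Conv1d expects
        # Left-pad so that each output position sees only the current
        # token and the (context_size - 1) tokens before it.
        emb = F.pad(emb, (self.context_size - 1, 0))
        out = self.conv(emb)         # (N, D, U)
        return out.permute(0, 2, 1)  # (N, U, D)


decoder = StatelessDecoderSketch(vocab_size=10, embedding_dim=4, context_size=2)
tokens = torch.tensor([[1, 2, 3, 4]])
output = decoder(tokens)  # shape: (1, 4, 4)
```

In this sketch, each output position depends only on a fixed window of `context_size` tokens, so the module carries no recurrent state between positions, unlike the LSTM decoder in `transducer`.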