mirror of https://github.com/k2-fsa/icefall.git synced 2025-08-09 10:02:22 +00:00

* First upload of model average codes.

* minor fix

* update decode file

* update .flake8

* rename pruned_transducer_stateless3 to pruned_transducer_stateless4

* change epoch number counter starting from 1 instead of 0

* minor fix of pruned_transducer_stateless4/train.py

* refactor the checkpoint.py

* minor fix, update docs, and modify the epoch number to count from 1 in the pruned_transducer_stateless4/decode.py

* update author info

* add docs of the scaling in function average_checkpoints_with_averaged_model

2022-05-05 21:20:04 +08:00

| Name | Last commit | Date |
|------|-------------|------|
| conformer_ctc | Fix potential bugs in PyTorch that exist in label_smoothing. (#300) | 2022-04-08 13:41:33 +08:00 |
| conformer_mmi | Change for asr_datamodule.py (#241) | 2022-03-14 00:30:58 +08:00 |
| local | Validate generated manifest files. (#338) | 2022-05-03 07:02:54 +08:00 |
| pruned_transducer_stateless | Modified conformer with multi datasets (#312) | 2022-04-29 15:40:30 +08:00 |
| pruned_transducer_stateless2 | Save batch to disk on OOM. (#343) | 2022-05-05 15:09:23 +08:00 |
| pruned_transducer_stateless3 | Fix decoding for gigaspeech in the libri + giga setup. (#345) | 2022-05-05 20:58:46 +08:00 |
| pruned_transducer_stateless4 | Model average (#344) | 2022-05-05 21:20:04 +08:00 |
| streaming_conformer_ctc | Reset seed at the beginning of each epoch. (#221) | 2022-02-21 15:16:39 +08:00 |
| tdnn_lstm_ctc | Some cleanups | 2022-04-04 13:37:10 +08:00 |
| transducer | Fix some typos. (#329) | 2022-04-22 15:54:59 +08:00 |
| transducer_lstm | Fix some typos. (#329) | 2022-04-22 15:54:59 +08:00 |
| transducer_stateless | Support computing RNN-T loss with torchaudio (#316) | 2022-04-19 18:47:13 +08:00 |
| transducer_stateless2 | Support computing RNN-T loss with torchaudio (#316) | 2022-04-19 18:47:13 +08:00 |
| transducer_stateless_multi_datasets | Don't use a lambda for dataloader's worker_init_fn. (#284) | 2022-03-31 20:32:00 +08:00 |
| .gitignore | Modified conformer with multi datasets (#312) | 2022-04-29 15:40:30 +08:00 |
| prepare_giga_speech.sh | Modified conformer with multi datasets (#312) | 2022-04-29 15:40:30 +08:00 |
| prepare.sh | Validate generated manifest files. (#338) | 2022-05-03 07:02:54 +08:00 |
| README.md | Update results. (#340) | 2022-04-29 15:49:45 +08:00 |
| RESULTS-100hours.md | Update result for full libri + GigaSpeech using transducer_stateless. (#231) | 2022-03-01 17:01:46 +08:00 |
| RESULTS.md | Update results. (#340) | 2022-04-29 15:49:45 +08:00 |
| shared | Refactoring (#4) | 2021-08-04 14:53:02 +08:00 |

README.md

Introduction

Please refer to https://icefall.readthedocs.io/en/latest/recipes/librispeech/index.html for how to run models in this recipe.

./RESULTS.md contains the latest results.

Transducers

Several folders in this directory have `transducer` in their names. The following table lists the differences among them.

| Folder | Encoder | Decoder | Comment |
|--------|---------|---------|---------|
| `transducer` | Conformer | LSTM | |
| `transducer_stateless` | Conformer | Embedding + Conv1d | Using optimized_transducer for computing RNN-T loss |
| `transducer_stateless2` | Conformer | Embedding + Conv1d | Using torchaudio for computing RNN-T loss |
| `transducer_lstm` | LSTM | LSTM | |
| `transducer_stateless_multi_datasets` | Conformer | Embedding + Conv1d | Using data from GigaSpeech as extra training data |
| `pruned_transducer_stateless` | Conformer | Embedding + Conv1d | Using k2 pruned RNN-T loss |
| `pruned_transducer_stateless2` | Conformer (modified) | Embedding + Conv1d | Using k2 pruned RNN-T loss |
| `pruned_transducer_stateless3` | Conformer (modified) | Embedding + Conv1d | Using k2 pruned RNN-T loss + using GigaSpeech as extra training data |

The decoder in `transducer_stateless` is modified from the one described in the paper "RNN-Transducer with Stateless Prediction Network": we place an additional Conv1d layer right after the input embedding layer.
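To make the "Embedding + Conv1d" idea concrete, here is a minimal, framework-free sketch of such a decoder. It is not the code from this recipe. The class name `StatelessDecoder`, the depthwise (per-channel) kernel, the zero left-padding, and the context size of 2 are all assumptions chosen for illustration. The point it shows is that each output depends only on a fixed window of the most recent tokens, so no recurrent state is carried along.

```python
import random


class StatelessDecoder:
    """Toy embedding + causal 1-D convolution (hypothetical, for illustration)."""

    def __init__(self, vocab_size, embed_dim, context_size, seed=0):
        rng = random.Random(seed)
        self.embed_dim = embed_dim
        self.context_size = context_size
        # Embedding table: one vector of length embed_dim per token id.
        self.embedding = [
            [rng.uniform(-1.0, 1.0) for _ in range(embed_dim)]
            for _ in range(vocab_size)
        ]
        # Assumed depthwise kernel: one weight per channel and context position.
        self.kernel = [
            [rng.uniform(-1.0, 1.0) for _ in range(context_size)]
            for _ in range(embed_dim)
        ]

    def forward(self, tokens):
        """Return one output vector per input token."""
        embedded = [self.embedding[t] for t in tokens]
        # Assumed zero left-padding, so position i only sees tokens up to i.
        pad = [[0.0] * self.embed_dim] * (self.context_size - 1)
        padded = pad + embedded
        outputs = []
        for i in range(len(tokens)):
            window = padded[i : i + self.context_size]
            outputs.append([
                sum(
                    self.kernel[d][k] * window[k][d]
                    for k in range(self.context_size)
                )
                for d in range(self.embed_dim)
            ])
        return outputs


decoder = StatelessDecoder(vocab_size=10, embed_dim=4, context_size=2)
a = decoder.forward([3, 5, 7])
b = decoder.forward([9, 5, 7])
# With context_size=2, the last output depends only on tokens (5, 7),
# so it is identical for both histories even though the first token differs.
same_last_output = a[2] == b[2]
```

In a real model the same structure would typically be built from a framework's embedding and 1-D convolution layers; the pure-Python version above only keeps the example dependency-free.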