Modified conformer with multi datasets (#312)

* Copy files for editing.

* Use librispeech + gigaspeech with modified conformer.

* Support specifying number of workers for on-the-fly feature extraction.

* Feature extraction code for GigaSpeech.

* Combine XL splits lazily during training.

* Fix warnings in decoding.

* Add decoding code for GigaSpeech.

* Fix decoding the gigaspeech dataset.

We have to use the decoder/joiner networks for the GigaSpeech dataset.

* Disable speed perturbation for the XL subset.

* Compute the Nbest oracle WER for RNN-T decoding.

* Minor fixes.

* Minor fixes.

* Add results.

* Update results.

* Update CI.

* Update results.

* Fix style issues.

* Update results.

* Fix style issues.

2022-04-29 15:40:30 +08:00

| Name | Last commit | Date |
|------|-------------|------|
| conformer_ctc | Fix potential bugs in PyTorch that exist in label_smoothing. (#300) | 2022-04-08 13:41:33 +08:00 |
| conformer_mmi | Change for asr_datamodule.py (#241) | 2022-03-14 00:30:58 +08:00 |
| local | Modified conformer with multi datasets (#312) | 2022-04-29 15:40:30 +08:00 |
| pruned_transducer_stateless | Modified conformer with multi datasets (#312) | 2022-04-29 15:40:30 +08:00 |
| pruned_transducer_stateless2 | Modified conformer with multi datasets (#312) | 2022-04-29 15:40:30 +08:00 |
| pruned_transducer_stateless3 | Modified conformer with multi datasets (#312) | 2022-04-29 15:40:30 +08:00 |
| streaming_conformer_ctc | Reset seed at the beginning of each epoch. (#221) | 2022-02-21 15:16:39 +08:00 |
| tdnn_lstm_ctc | Some cleanups | 2022-04-04 13:37:10 +08:00 |
| transducer | Fix some typos. (#329) | 2022-04-22 15:54:59 +08:00 |
| transducer_lstm | Fix some typos. (#329) | 2022-04-22 15:54:59 +08:00 |
| transducer_stateless | Support computing RNN-T loss with torchaudio (#316) | 2022-04-19 18:47:13 +08:00 |
| transducer_stateless2 | Support computing RNN-T loss with torchaudio (#316) | 2022-04-19 18:47:13 +08:00 |
| transducer_stateless_multi_datasets | Don't use a lambda for dataloader's worker_init_fn. (#284) | 2022-03-31 20:32:00 +08:00 |
| .gitignore | Modified conformer with multi datasets (#312) | 2022-04-29 15:40:30 +08:00 |
| prepare_giga_speech.sh | Modified conformer with multi datasets (#312) | 2022-04-29 15:40:30 +08:00 |
| prepare.sh | Add LG decoding (#277) | 2022-04-19 17:23:46 +08:00 |
| README.md | Modified conformer with multi datasets (#312) | 2022-04-29 15:40:30 +08:00 |
| RESULTS-100hours.md | Update result for full libri + GigaSpeech using transducer_stateless. (#231) | 2022-03-01 17:01:46 +08:00 |
| RESULTS.md | Modified conformer with multi datasets (#312) | 2022-04-29 15:40:30 +08:00 |
| shared | Refactoring (#4) | 2021-08-04 14:53:02 +08:00 |

README.md

See index.html for how to run models in this recipe.

Transducers

Several folders in this directory have `transducer` in their names. The following table lists the differences among them.

| Folder | Encoder | Decoder | Comment |
|--------|---------|---------|---------|
| `transducer` | Conformer | LSTM | |
| `transducer_stateless` | Conformer | Embedding + Conv1d | Using optimized_transducer for computing RNN-T loss |
| `transducer_stateless2` | Conformer | Embedding + Conv1d | Using torchaudio for computing RNN-T loss |
| `transducer_lstm` | LSTM | LSTM | |
| `transducer_stateless_multi_datasets` | Conformer | Embedding + Conv1d | Using data from GigaSpeech as extra training data |
| `pruned_transducer_stateless` | Conformer | Embedding + Conv1d | Using k2 pruned RNN-T loss |
| `pruned_transducer_stateless2` | Conformer (modified) | Embedding + Conv1d | Using k2 pruned RNN-T loss |
| `pruned_transducer_stateless3` | Conformer (modified) | Embedding + Conv1d | Using k2 pruned RNN-T loss + using GigaSpeech as extra training data |

The decoder in `transducer_stateless` is modified from the paper "Rnn-Transducer with Stateless Prediction Network". We place an additional Conv1d layer right after the input embedding layer.
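
To illustrate what "Embedding + Conv1d" means for a stateless decoder, here is a minimal dependency-free sketch. It is not the code used in this recipe: the class name `StatelessDecoder`, the parameters `context_size` and `blank_id`, the random weights, and the choice of a per-channel (depthwise) convolution are all assumptions made for illustration. The point it shows is that each output depends only on the last `context_size` labels, so no recurrent state is carried.

```python
import random


class StatelessDecoder:
    """Toy stateless prediction network: embedding lookup followed by a
    causal 1-D convolution over the last `context_size` labels.

    Hypothetical illustration only; all names and weights are assumptions.
    """

    def __init__(self, vocab_size, embed_dim, context_size=2, blank_id=0, seed=0):
        rng = random.Random(seed)
        self.embed_dim = embed_dim
        self.context_size = context_size
        self.blank_id = blank_id
        # Embedding table of shape (vocab_size, embed_dim).
        self.embedding = [
            [rng.uniform(-1.0, 1.0) for _ in range(embed_dim)]
            for _ in range(vocab_size)
        ]
        # Assumption: the blank label acts as padding and maps to zeros.
        self.embedding[blank_id] = [0.0] * embed_dim
        # One kernel of length `context_size` per channel (depthwise conv).
        self.conv = [
            [rng.uniform(-1.0, 1.0) for _ in range(context_size)]
            for _ in range(embed_dim)
        ]

    def forward(self, labels):
        """Return one `embed_dim`-sized vector per input label."""
        # Left-pad with blanks so every position sees `context_size` labels.
        padded = [self.blank_id] * (self.context_size - 1) + list(labels)
        outputs = []
        for u in range(len(labels)):
            window = padded[u : u + self.context_size]
            outputs.append(
                [
                    sum(
                        self.conv[c][k] * self.embedding[window[k]][c]
                        for k in range(self.context_size)
                    )
                    for c in range(self.embed_dim)
                ]
            )
        return outputs


decoder = StatelessDecoder(vocab_size=10, embed_dim=4, context_size=2)
a = decoder.forward([1, 2, 3, 4])
b = decoder.forward([7, 8, 3, 4])
# The two histories differ, but their last two labels match.
print(a[-1] == b[-1])  # True
```

Because the output is a function of a fixed-size label window, hypotheses that share their last `context_size` labels produce identical decoder outputs, which is what distinguishes this design from the LSTM decoders listed in the table.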