# Introduction

Please refer to https://icefall.readthedocs.io/en/latest/recipes/librispeech/index.html for how to run models in this recipe.

./RESULTS.md contains the latest results.

# Transducers

There are various folders in this directory whose names contain the word transducer. The following table lists the differences among them.

|                                        | Encoder                    | Decoder            | Comment                                                                                                                      |
|----------------------------------------|----------------------------|--------------------|------------------------------------------------------------------------------------------------------------------------------|
| transducer                             | Conformer                  | LSTM               |                                                                                                                              |
| transducer_stateless                   | Conformer                  | Embedding + Conv1d | Using optimized_transducer for computing RNN-T loss                                                                          |
| transducer_stateless2                  | Conformer                  | Embedding + Conv1d | Using torchaudio for computing RNN-T loss                                                                                    |
| transducer_lstm                        | LSTM                       | LSTM               |                                                                                                                              |
| transducer_stateless_multi_datasets    | Conformer                  | Embedding + Conv1d | Using data from GigaSpeech as extra training data                                                                            |
| pruned_transducer_stateless            | Conformer                  | Embedding + Conv1d | Using k2 pruned RNN-T loss                                                                                                   |
| pruned_transducer_stateless2           | Conformer (modified)       | Embedding + Conv1d | Using k2 pruned RNN-T loss                                                                                                   |
| pruned_transducer_stateless3           | Conformer (modified)       | Embedding + Conv1d | Using k2 pruned RNN-T loss + using GigaSpeech as extra training data                                                         |
| pruned_transducer_stateless4           | Conformer (modified)       | Embedding + Conv1d | Same as pruned_transducer_stateless2 + save averaged models periodically during training (see the sketch after this table)  |
| pruned_transducer_stateless5           | Conformer (modified)       | Embedding + Conv1d | Same as pruned_transducer_stateless4 + more layers + random combiner                                                         |
| pruned_transducer_stateless6           | Conformer (modified)       | Embedding + Conv1d | Same as pruned_transducer_stateless4 + distillation with HuBERT                                                              |
| pruned_stateless_emformer_rnnt2        | Emformer (from torchaudio) | Embedding + Conv1d | Using Emformer from torchaudio for streaming ASR                                                                             |
| conv_emformer_transducer_stateless     | Emformer                   | Embedding + Conv1d | Using Emformer augmented with convolution for streaming ASR + mechanisms in reworked model                                   |
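
The "save averaged models periodically during training" entry for pruned_transducer_stateless4 refers to keeping a running average of the model parameters alongside the model being trained, and saving that averaged copy. Below is a minimal, illustrative sketch of the idea; the function name, the uniform weighting scheme, and the update period are assumptions for the example, not the exact code used in the recipe.

```python
# Illustrative sketch only: maintain a running average of model parameters
# during training. The weighting below is a plain arithmetic average over
# all updates performed so far; the real recipe may weight updates differently.
import copy
import torch


def update_averaged_model(model: torch.nn.Module,
                          model_avg: torch.nn.Module,
                          num_updates: int) -> None:
    """Blend the current parameters into the running average in place."""
    weight = 1.0 / num_updates  # uniform average over all updates so far
    with torch.no_grad():
        for p_avg, p in zip(model_avg.parameters(), model.parameters()):
            p_avg.mul_(1.0 - weight).add_(p, alpha=weight)


# Keep a separate copy next to the training model and update it periodically;
# the averaged copy is what would be written to disk as a checkpoint.
model = torch.nn.Linear(4, 4)
model_avg = copy.deepcopy(model)
for step in range(1, 101):
    # ... a real training loop would call optimizer.step() on `model` here ...
    if step % 10 == 0:
        update_averaged_model(model, model_avg, num_updates=step // 10)
```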

The decoder in transducer_stateless is modified from the paper *RNN-Transducer with Stateless Prediction Network*. We place an additional Conv1d layer right after the input embedding layer.
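
For illustration only, here is a minimal sketch of such a stateless decoder: an embedding over previously emitted tokens followed by a causal Conv1d that covers a small, fixed context, so the prediction network needs no recurrent state. The class name and hyper-parameters (vocab_size, embedding_dim, context_size) are assumptions for the example and do not reproduce the exact layer configuration used in the recipes.

```python
# Illustrative sketch of a "stateless" transducer decoder:
# embedding of the last few predicted tokens + a causal Conv1d.
import torch
import torch.nn as nn


class StatelessDecoder(nn.Module):
    def __init__(self, vocab_size: int, embedding_dim: int, context_size: int = 2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        # Depthwise Conv1d whose kernel spans the previous `context_size` tokens.
        self.conv = nn.Conv1d(
            embedding_dim, embedding_dim,
            kernel_size=context_size,
            groups=embedding_dim,
        )

    def forward(self, y: torch.Tensor) -> torch.Tensor:
        # y: (batch, num_tokens) of previously emitted token IDs.
        emb = self.embedding(y).permute(0, 2, 1)                 # (B, C, U)
        emb = nn.functional.pad(emb, (self.conv.kernel_size[0] - 1, 0))  # causal left padding
        out = self.conv(emb).permute(0, 2, 1)                    # (B, U, C)
        return torch.relu(out)


decoder = StatelessDecoder(vocab_size=500, embedding_dim=256)
tokens = torch.randint(0, 500, (8, 10))
print(decoder(tokens).shape)  # torch.Size([8, 10, 256])
```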