Archived

This repository has been archived on 2026-03-23. You can view files and clone it, but cannot push or open issues or pull requests.



README.md

Introduction

Please refer to https://icefall.readthedocs.io/en/latest/recipes/librispeech/index.html for how to run models in this recipe.

./RESULTS.md contains the latest results.

Transducers

Several folders in this directory have `transducer` in their names. The following table lists the differences among them.

| Folder | Encoder | Decoder | Comment |
|---|---|---|---|
| `transducer` | Conformer | LSTM | |
| `transducer_stateless` | Conformer | Embedding + Conv1d | Using optimized_transducer for computing RNN-T loss |
| `transducer_stateless2` | Conformer | Embedding + Conv1d | Using torchaudio for computing RNN-T loss |
| `transducer_lstm` | LSTM | LSTM | |
| `transducer_stateless_multi_datasets` | Conformer | Embedding + Conv1d | Using data from GigaSpeech as extra training data |
| `pruned_transducer_stateless` | Conformer | Embedding + Conv1d | Using k2 pruned RNN-T loss |
| `pruned_transducer_stateless2` | Conformer (modified) | Embedding + Conv1d | Using k2 pruned RNN-T loss |
| `pruned_transducer_stateless3` | Conformer (modified) | Embedding + Conv1d | Using k2 pruned RNN-T loss + using GigaSpeech as extra training data |
| `pruned_transducer_stateless4` | Conformer (modified) | Embedding + Conv1d | Same as `pruned_transducer_stateless2` + save averaged models periodically during training |
| `pruned_transducer_stateless5` | Conformer (modified) | Embedding + Conv1d | Same as `pruned_transducer_stateless4` + more layers + random combiner |
| `pruned_transducer_stateless6` | Conformer (modified) | Embedding + Conv1d | Same as `pruned_transducer_stateless4` + distillation with HuBERT |
| `pruned_stateless_emformer_rnnt2` | Emformer (from torchaudio) | Embedding + Conv1d | Using Emformer from torchaudio for streaming ASR |
| `conv_emformer_transducer_stateless` | Emformer | Embedding + Conv1d | Using Emformer augmented with convolution for streaming ASR + mechanisms in reworked model |

The decoder in `transducer_stateless` is modified from the paper "RNN-Transducer with Stateless Prediction Network". We place an additional `Conv1d` layer right after the input embedding layer.
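
The "Embedding + Conv1d" decoder can be illustrated with a minimal PyTorch sketch. This is not the recipe's actual code: the class name `StatelessDecoder`, the parameters `context_size` and `blank_id`, the depthwise convolution (`groups=embed_dim`), and the final ReLU are all assumptions made for illustration. The point it shows is that the output at each position depends only on the last `context_size` tokens, so the decoder carries no recurrent state.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class StatelessDecoder(nn.Module):
    """Hypothetical sketch: an embedding followed by a Conv1d over recent tokens."""

    def __init__(self, vocab_size: int, embed_dim: int, context_size: int, blank_id: int = 0):
        super().__init__()
        assert context_size >= 1
        self.context_size = context_size
        # Assumption: the blank token is used as the padding index.
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=blank_id)
        # Assumption: a depthwise convolution whose kernel spans the context window.
        self.conv = nn.Conv1d(
            embed_dim,
            embed_dim,
            kernel_size=context_size,
            groups=embed_dim,
            bias=False,
        )

    def forward(self, y: torch.Tensor) -> torch.Tensor:
        # y: (batch, num_tokens) of token IDs
        emb = self.embedding(y).permute(0, 2, 1)  # (batch, embed_dim, num_tokens)
        # Left-pad so that position u only sees tokens u-context_size+1 .. u.
        emb = F.pad(emb, (self.context_size - 1, 0))
        out = self.conv(emb).permute(0, 2, 1)  # (batch, num_tokens, embed_dim)
        return F.relu(out)


if __name__ == "__main__":
    decoder = StatelessDecoder(vocab_size=500, embed_dim=8, context_size=2)
    tokens = torch.tensor([[0, 17, 42, 3]])
    print(decoder(tokens).shape)
```

With `context_size=2`, two token sequences that end in the same two tokens produce the same decoder output at the final position, which is what makes the prediction network "stateless" compared with an LSTM decoder.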