Introduction

Please refer to https://icefall.readthedocs.io/en/latest/recipes/librispeech/index.html for how to run models in this recipe.

./RESULTS.md contains the latest results.

Transducers

This folder contains several recipes whose names include the word transducer. The following table lists the differences among them.

|                                     | Encoder                   | Decoder            | Comment                                                                                   |
|-------------------------------------|---------------------------|--------------------|-------------------------------------------------------------------------------------------|
| transducer                          | Conformer                 | LSTM               |                                                                                           |
| transducer_stateless                | Conformer                 | Embedding + Conv1d | Using optimized_transducer for computing RNN-T loss                                       |
| transducer_stateless2               | Conformer                 | Embedding + Conv1d | Using torchaudio for computing RNN-T loss                                                 |
| transducer_lstm                     | LSTM                      | LSTM               |                                                                                           |
| transducer_stateless_multi_datasets | Conformer                 | Embedding + Conv1d | Using data from GigaSpeech as extra training data                                         |
| pruned_transducer_stateless         | Conformer                 | Embedding + Conv1d | Using k2 pruned RNN-T loss                                                                |
| pruned_transducer_stateless2        | Conformer (modified)      | Embedding + Conv1d | Using k2 pruned RNN-T loss                                                                |
| pruned_transducer_stateless3        | Conformer (modified)      | Embedding + Conv1d | Using k2 pruned RNN-T loss + using GigaSpeech as extra training data                      |
| pruned_transducer_stateless4        | Conformer (modified)      | Embedding + Conv1d | Same as pruned_transducer_stateless2 + save averaged models periodically during training  |
| pruned_transducer_stateless5        | Conformer (modified)      | Embedding + Conv1d | Same as pruned_transducer_stateless4 + more layers + random combiner                      |
| pruned_transducer_stateless6        | Conformer (modified)      | Embedding + Conv1d | Same as pruned_transducer_stateless4 + distillation with HuBERT                           |
| pruned_stateless_emformer_rnnt2     | Emformer (from torchaudio)| Embedding + Conv1d | Using Emformer from torchaudio for streaming ASR                                          |
| conv_emformer_transducer_stateless  | ConvEmformer              | Embedding + Conv1d | Using ConvEmformer for streaming ASR + mechanisms in reworked model                       |
| conv_emformer_transducer_stateless2 | ConvEmformer              | Embedding + Conv1d | Using ConvEmformer with simplified memory for streaming ASR + mechanisms in reworked model |
| lstm_transducer_stateless           | LSTM                      | Embedding + Conv1d | Using LSTM with mechanisms in reworked model                                              |

The decoder in transducer_stateless is modified from the paper "Rnn-Transducer with Stateless Prediction Network". We place an additional Conv1d layer right after the input embedding layer.
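To make the "Embedding + Conv1d" decoder concrete, here is a minimal PyTorch sketch of a stateless prediction network: an embedding layer followed by a Conv1d over a small fixed left context of previous tokens. The class name, the context size of 2, and the depthwise-convolution choice are illustrative assumptions, not the exact implementation used in these recipes.

```python
# A minimal sketch of a stateless transducer decoder: embedding + Conv1d over a
# bounded token context (no recurrent state). Names and hyperparameters here
# are illustrative, not the exact ones used in icefall.
import torch
import torch.nn as nn


class StatelessDecoder(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int, context_size: int = 2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # The Conv1d over the previous `context_size` tokens replaces an LSTM
        # prediction network; it only sees a bounded left context, hence "stateless".
        self.conv = nn.Conv1d(
            in_channels=embed_dim,
            out_channels=embed_dim,
            kernel_size=context_size,
            groups=embed_dim,  # depthwise convolution (an assumption in this sketch)
        )

    def forward(self, y: torch.Tensor) -> torch.Tensor:
        # y: (batch, context_size) IDs of the most recently emitted tokens
        emb = self.embedding(y)          # (batch, context_size, embed_dim)
        emb = emb.permute(0, 2, 1)       # (batch, embed_dim, context_size)
        out = self.conv(emb)             # (batch, embed_dim, 1)
        return out.permute(0, 2, 1)      # (batch, 1, embed_dim)


# Usage example: one decoder step for a batch of 4 utterances with 2-token context.
decoder = StatelessDecoder(vocab_size=500, embed_dim=512, context_size=2)
tokens = torch.randint(0, 500, (4, 2))
print(decoder(tokens).shape)  # torch.Size([4, 1, 512])
```

Because the decoder depends only on the last few tokens rather than the full history, its output can be cached per token context, which keeps decoding cheap compared with an LSTM prediction network.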