Minor fixes to the RNN-T Conformer model (#152)

* Disable weight decay.

* Remove input feature batchnorm.

* Replace BatchNorm in the Conformer model with LayerNorm.

* Use tanh in the joint network.

* Remove sos ID.

* Reduce the number of decoder layers from 4 to 2.

* Minor fixes.

* Fix typos.

2021-12-23 13:54:25 +08:00

conformer_ctc

RNN-T Conformer training for LibriSpeech (#143)

2021-12-18 07:42:51 +08:00

conformer_mmi

Set fsa.properties to None after changing its labels in-place. (#121)

2021-11-16 23:11:30 +08:00

local

RNN-T Conformer training for LibriSpeech (#143)

2021-12-18 07:42:51 +08:00

streaming_conformer_ctc

Draft streaming decoding (#89)

2021-11-24 19:35:18 +08:00

tdnn_lstm_ctc

Associate a cut with token alignment (without repeats) (#125)

2021-11-29 18:50:54 +08:00

transducer

Minor fixes to the RNN-T Conformer model (#152)

2021-12-23 13:54:25 +08:00

transducer_lstm

Increase the size of the context in the RNN-T decoder. (#153)

2021-12-23 07:55:02 +08:00

transducer_stateless

Increase the size of the context in the RNN-T decoder. (#153)

2021-12-23 07:55:02 +08:00

prepare.sh

Add MMI training with word pieces as modelling unit. (#6)

2021-10-18 15:20:32 +08:00

README.md

Increase the size of the context in the RNN-T decoder. (#153)

2021-12-23 07:55:02 +08:00

RESULTS.md

Minor fixes to the RNN-T Conformer model (#152)

2021-12-23 13:54:25 +08:00

shared

Refactoring (#4)

2021-08-04 14:53:02 +08:00

README.md

Introduction

Please refer to https://icefall.readthedocs.io/en/latest/recipes/librispeech.html for how to run models in this recipe.

Transducers

Several folders in this directory have `transducer` in their names. The following table lists the differences among them.

|                        | Encoder   | Decoder            |
|------------------------|-----------|--------------------|
| `transducer`           | Conformer | LSTM               |
| `transducer_stateless` | Conformer | Embedding + Conv1d |
| `transducer_lstm`      | LSTM      | LSTM               |

The decoder in `transducer_stateless` is modified from the paper "Rnn-Transducer with Stateless Prediction Network". We place an additional Conv1d layer right after the input embedding layer.
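
To illustrate what "stateless" means here, the following is a minimal pure-Python sketch of an embedding lookup followed by a causal 1-D convolution. It is not the code in `transducer_stateless`: the class name `ToyStatelessDecoder`, the `kernel` layout, the zero left-padding, and the random weights are all assumptions made for illustration. The point it tries to show is that the output at each position depends only on the last `context_size` tokens, with no recurrent state carried along.

```python
import random


class ToyStatelessDecoder:
    """Hypothetical toy decoder: embedding followed by a causal Conv1d."""

    def __init__(self, vocab_size, embed_dim, context_size, seed=0):
        rng = random.Random(seed)
        self.embed_dim = embed_dim
        self.context_size = context_size
        # Embedding table: one vector of size embed_dim per token ID.
        self.embedding = [
            [rng.uniform(-1.0, 1.0) for _ in range(embed_dim)]
            for _ in range(vocab_size)
        ]
        # Convolution kernel indexed as kernel[k][i][o]
        # (context position, input channel, output channel).
        # This layout is an assumption chosen for readability.
        self.kernel = [
            [[rng.uniform(-1.0, 1.0) for _ in range(embed_dim)]
             for _ in range(embed_dim)]
            for _ in range(context_size)
        ]

    def forward(self, tokens):
        zero = [0.0] * self.embed_dim
        # Left-pad with zero vectors so that every position sees a
        # window of exactly context_size embeddings (causal convolution).
        embedded = [zero] * (self.context_size - 1)
        embedded += [self.embedding[t] for t in tokens]
        outputs = []
        for u in range(len(tokens)):
            window = embedded[u:u + self.context_size]
            out = [0.0] * self.embed_dim
            for k, emb in enumerate(window):
                for o in range(self.embed_dim):
                    out[o] += sum(
                        emb[i] * self.kernel[k][i][o]
                        for i in range(self.embed_dim)
                    )
            outputs.append(out)
        return outputs


decoder = ToyStatelessDecoder(vocab_size=10, embed_dim=4, context_size=2)
a = decoder.forward([1, 2, 3, 4])
b = decoder.forward([7, 8, 3, 4])
# The two histories differ, but their last two tokens match, so the
# final outputs are identical.
print(a[3] == b[3])  # True
```

In this sketch, an LSTM decoder would instead carry a hidden state that summarizes the whole history, so two sequences ending in the same tokens would generally produce different outputs.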