Fixes incorrect computation of encoder_dim when encoder_dim is a comma-separated list of integers by ensuring numeric (not lexicographic) max is used. Fixes #2018

- Replace `int(max(params.encoder_dim.split(",")))` (lexicographic max on strings) with `max(_to_int_tuple(params.encoder_dim))` (numeric max).
- Apply the fix consistently across all affected training scripts.
Introduction
This recipe contains several ASR models trained on TedLium3.
Transducers
Several folders in this directory have `transducer` in their names.
The following table lists the differences among them.
| | Encoder | Decoder | Comment |
|---|---|---|---|
| `transducer_stateless` | Conformer | Embedding + Conv1d | |
| `pruned_transducer_stateless` | Conformer | Embedding + Conv1d | Using k2 pruned RNN-T loss |
The decoder in `transducer_stateless` is modified from the paper
RNN-Transducer with Stateless Prediction Network.
We place an additional Conv1d layer right after the input embedding layer.
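
As a rough illustration of the "Embedding + Conv1d" decoder described above, here is a dependency-free sketch. It is not the recipe's implementation: the class name, the random weights, and the `context_size` parameter are assumptions made for this example, and a real model would more likely be built from `torch.nn.Embedding` and `torch.nn.Conv1d`. The point it shows is that each output depends only on the last few tokens, with no recurrent state.

```python
import random
from typing import List


class StatelessDecoderSketch:
    """Toy decoder: embedding lookup followed by a causal 1-D convolution."""

    def __init__(self, vocab_size: int, dim: int, context_size: int, seed: int = 0):
        rng = random.Random(seed)
        self.dim = dim
        self.context_size = context_size  # hypothetical name: tokens seen per output
        # Embedding table: one vector of length `dim` per token ID.
        self.embedding = [
            [rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)
        ]
        # Conv1d kernel indexed as weight[tap][out_channel][in_channel].
        self.weight = [
            [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
            for _ in range(context_size)
        ]

    def forward(self, tokens: List[int]) -> List[List[float]]:
        emb = [self.embedding[t] for t in tokens]
        # Left-pad with zeros so there is one output per input token.
        pad = [[0.0] * self.dim for _ in range(self.context_size - 1)]
        padded = pad + emb
        out = []
        for u in range(len(tokens)):
            window = padded[u : u + self.context_size]
            out.append([
                sum(
                    self.weight[k][o][i] * window[k][i]
                    for k in range(self.context_size)
                    for i in range(self.dim)
                )
                for o in range(self.dim)
            ])
        return out


decoder = StatelessDecoderSketch(vocab_size=10, dim=4, context_size=2)
a = decoder.forward([1, 2, 3, 4])
b = decoder.forward([9, 9, 3, 4])
# The last output sees only tokens (3, 4) in both cases, so it is identical.
print(a[-1] == b[-1])  # True
```

With `context_size=2`, the two sequences differ only in tokens that fall outside the final window, so the final outputs match.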