Mirror of https://github.com/k2-fsa/icefall.git (synced 2025-08-08 17:42:21 +00:00)
* disable speed perturbation by default
* minor fixes
* minor updates
* updated the bash scripts to pass the `speed-perturb` argument
* minor fixes
1. renamed the `speed-perturb` argument to `perturb-speed` to align with the librispeech recipe
>> 00256a7669/egs/librispeech/ASR/local/compute_fbank_librispeech.py (L65)
2. changed the argument type of `perturb-speed` to `str2bool`
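The argument change described above can be sketched as follows. This is a minimal, self-contained illustration, not the recipe's actual code: the `str2bool` helper here is a local stand-in written for this example, `get_parser` is a hypothetical name, and the default of `False` only reflects the "disable speed perturbation by default" note.

```python
import argparse


def str2bool(value):
    """Convert a command-line string such as "true" or "0" to a bool.

    Local stand-in for illustration; the recipe may use its own helper.
    """
    if isinstance(value, bool):
        return value
    lowered = value.lower()
    if lowered in ("yes", "true", "t", "y", "1"):
        return True
    if lowered in ("no", "false", "f", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"Boolean value expected, got {value!r}")


def get_parser():
    """Build a parser exposing a --perturb-speed flag (hypothetical layout)."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--perturb-speed",
        type=str2bool,
        default=False,  # assumption: disabled unless explicitly requested
        help="Whether to apply speed perturbation.",
    )
    return parser
```

With a parser like this, a bash script can pass `--perturb-speed true` or `--perturb-speed false` explicitly, which is easier to forward from a shell variable than a bare `store_true` switch.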
Introduction
Please refer to https://icefall.readthedocs.io/en/latest/recipes/Non-streaming-ASR/aishell/index.html for how to run models in this recipe.
Transducers
Several subfolders in this folder have `transducer` in their names.
The following table lists the differences among them.
| | Encoder | Decoder | Comment |
|---|---|---|---|
| `transducer_stateless` | Conformer | Embedding + Conv1d | with `k2.rnnt_loss` |
| `transducer_stateless_modified` | Conformer | Embedding + Conv1d | with modified transducer from `optimized_transducer` |
| `transducer_stateless_modified-2` | Conformer | Embedding + Conv1d | with modified transducer from `optimized_transducer` + extra data |
| `pruned_transducer_stateless3` | Conformer (reworked) | Embedding + Conv1d | pruned RNN-T + reworked model with random combiner + using aidatatang_200zh as extra data |
| `pruned_transducer_stateless7` | Zipformer | Embedding | pruned RNN-T + zipformer encoder + stateless decoder with context-size 1 |
The decoder in `transducer_stateless` is modified from the paper "Rnn-Transducer with Stateless Prediction Network".
We place an additional Conv1d layer right after the input embedding layer.
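The "Embedding + Conv1d" decoder described here can be sketched in PyTorch as below. This is an illustrative sketch under stated assumptions, not the recipe's implementation: the class name `StatelessDecoderSketch`, the argument names, the left-padding scheme, and the choice of `context_size=2` are all assumptions made for the example. The source only states that a Conv1d layer follows the input embedding layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class StatelessDecoderSketch(nn.Module):
    """Embedding followed by a Conv1d over a short window of previous tokens.

    Hypothetical sketch: names and hyperparameters are assumptions.
    """

    def __init__(self, vocab_size, embedding_dim, blank_id=0, context_size=2):
        super().__init__()
        self.context_size = context_size
        self.embedding = nn.Embedding(
            vocab_size, embedding_dim, padding_idx=blank_id
        )
        # The kernel spans `context_size` consecutive token embeddings.
        self.conv = nn.Conv1d(
            embedding_dim, embedding_dim, kernel_size=context_size, bias=False
        )

    def forward(self, y):
        # y: (N, U) tensor of token IDs
        emb = self.embedding(y)      # (N, U, C)
        emb = emb.permute(0, 2, 1)   # (N, C, U), the layout Conv1d expects
        # Left-pad so each output position sees only current and past tokens.
        emb = F.pad(emb, (self.context_size - 1, 0))
        out = self.conv(emb)         # (N, C, U)
        return out.permute(0, 2, 1)  # (N, U, C)
```

Because there is no recurrent state, the output at each position depends only on the last `context_size` tokens. Setting `context_size=1` reduces the window to the current token alone.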