Mirror of https://github.com/k2-fsa/icefall.git (synced 2025-08-23 00:36:14 +00:00)
There is no need to load_audio. AliMeeting audio data is in WAV format, while export_to_webdataset defaults to "flac". If load_audio is left at its default (True), webdataset shows "[Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'" during writing.
Introduction
This recipe includes several ASR models trained on AliMeeting (far).
./RESULTS.md contains the latest results.
Transducers
This folder contains various subfolders whose names include transducer.
The following table lists the differences among them.
| | Encoder | Decoder | Comment |
|---|---|---|---|
| pruned_transducer_stateless2 | Conformer (modified) | Embedding + Conv1d | Using k2 pruned RNN-T loss |
The decoder in transducer_stateless is modified from the paper "RNN-Transducer with Stateless Prediction Network".
We place an additional Conv1d layer right after the input embedding layer.
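To illustrate the "Embedding + Conv1d" idea, here is a minimal pure-Python sketch of a stateless decoder. It is not the recipe's code: the names `make_decoder` and `decode`, the depthwise (per-channel) kernel, the zero left-padding, and the random weights are all assumptions made for illustration. The point it shows is that each output depends only on the last `context_size` tokens, not on a recurrent state.

```python
import random


def make_decoder(vocab_size, dim, context_size, seed=0):
    """Create toy weights (hypothetical helper, random values for illustration)."""
    rng = random.Random(seed)
    # Embedding table: one vector of size `dim` per token id.
    embedding = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)]
    # Assumed depthwise Conv1d kernel: one weight per (channel, tap).
    kernel = [[rng.uniform(-1, 1) for _ in range(context_size)] for _ in range(dim)]
    return embedding, kernel


def decode(tokens, embedding, kernel):
    """Embed the tokens, then apply a 1-D convolution over a left-padded window."""
    dim = len(kernel)
    context_size = len(kernel[0])
    # Left-pad with zero vectors so the output has one frame per input token.
    zero = [0.0] * dim
    emb = [zero] * (context_size - 1) + [embedding[t] for t in tokens]
    out = []
    for u in range(len(tokens)):
        window = emb[u:u + context_size]  # the last `context_size` embeddings
        out.append([
            sum(kernel[c][k] * window[k][c] for k in range(context_size))
            for c in range(dim)
        ])
    return out


embedding, kernel = make_decoder(vocab_size=10, dim=4, context_size=2)
a = decode([5, 1, 2], embedding, kernel)
b = decode([7, 1, 2], embedding, kernel)
# With context_size=2, the last frame sees only tokens (1, 2) in both cases.
print(a[-1] == b[-1])
```

With `context_size=1` the kernel reduces to a per-channel scaling of the embedding, so in this sketch the convolution only adds context when `context_size` is greater than 1.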