mirror of
https://github.com/k2-fsa/icefall.git
synced 2025-08-26 18:24:18 +00:00
This recipe is mostly based on egs/csj, tweaked so that it can be run with the ReazonSpeech corpus. That being said, there are some big caveats:

* Currently the model quality is not very good. Actually, it is very bad. I trained a model on a 1000h corpus, and it resulted in >80% CER on JSUT.
* The core issue seems to be that Zipformer is prone to treating utterances as silent segments. It often produces an empty hypothesis even though the audio actually contains human voice.
* This issue has already been reported upstream and was not fully resolved as of Dec 2023.

Signed-off-by: Fujimoto Seiji <fujimoto@ceptord.net>
Symbolic link
1 line
74 B
Python
../../../librispeech/ASR/pruned_transducer_stateless7_streaming/decoder.py