* add egs/xbmu_amdo31
* fix xbmu_amdo31/ASR/pruned_transducer_stateless5/train.py
* fix xbmu_amdo31/ASR/pruned_transducer_stateless5/asr_datamodule.py
* fix xbmu_amdo31/ASR/prepare.sh
* add RESULTS.md and README.md
* fix pruned_transducer_stateless5 decode.py
* add transducer stateless7
* fix transducer_stateless7
* fix RESULTS.md error
* Add pruned_transducer_stateless7 validation set results
Introduction
About the XBMU-AMDO31 corpus
XBMU-AMDO31 is an open-source Amdo Tibetan speech corpus published by Northwest Minzu University. It is publicly available on Hugging Face: https://huggingface.co/datasets/syzym/xbmu_amdo31
The XBMU-AMDO31 dataset is a speech recognition corpus of the Amdo Tibetan dialect. The open-source corpus contains 31 hours of speech data and resources for building speech recognition systems, including transcribed texts and a Tibetan pronunciation lexicon. (The lexicon is a Tibetan lexicon of the Lhasa dialect, which has been reused for the Amdo dialect because of the uniformity of the Tibetan language.) The dataset can be used to train a model for Amdo Tibetan Automatic Speech Recognition (ASR).
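For readers who want to fetch the corpus outside of the recipe scripts, the following is a minimal sketch, not the recipe's own download step. It assumes the `huggingface_hub` Python package is installed and that the dataset repository at the URL above can be fetched as a plain snapshot. The helper names `repo_id_from_url` and `download_corpus`, and the target directory `download/xbmu_amdo31`, are hypothetical choices made for illustration.

```python
from urllib.parse import urlparse

# URL taken from the corpus description above.
DATASET_URL = "https://huggingface.co/datasets/syzym/xbmu_amdo31"


def repo_id_from_url(url: str) -> str:
    """Extract the '<owner>/<name>' repo id from a Hugging Face dataset URL."""
    parts = urlparse(url).path.strip("/").split("/")
    # Expected path shape: datasets/<owner>/<name>
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a Hugging Face dataset URL: {url}")
    return "/".join(parts[1:3])


def download_corpus(local_dir: str = "download/xbmu_amdo31") -> str:
    """Download a snapshot of the dataset repo and return its local path.

    The default local_dir is an assumption for this sketch, not a path
    required by the recipe.
    """
    # Imported lazily so repo_id_from_url works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id_from_url(DATASET_URL),
        repo_type="dataset",
        local_dir=local_dir,
    )
```

Calling `download_corpus()` needs network access and enough disk space for the audio. The layout of the downloaded files is whatever the dataset repository provides, so check it against what `prepare.sh` expects before pointing the recipe at it.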
This recipe includes several ASR models trained on XBMU-AMDO31.
./RESULTS.md contains the latest results.