Mirror of https://github.com/k2-fsa/icefall.git (synced 2025-08-09 10:02:22 +00:00)

Commit 4e2a4fdcd8 (parent bb6d672b54): readme

egs/mls_english/ASR/README.md (new file, 19 lines)
# Introduction

**Multilingual LibriSpeech (MLS)** is a large multilingual corpus suitable for speech research. The dataset is derived from read audiobooks from LibriVox and covers 8 languages: English, German, Dutch, Spanish, French, Italian, Portuguese, and Polish. It includes about 44.5K hours of English and a total of about 6K hours for the other languages. This icefall training recipe was created for the restructured version of the English split of the dataset, which is hosted on Hugging Face.

For more details, please visit:
- Dataset: https://huggingface.co/datasets/parler-tts/mls_eng
- Original MLS dataset link: https://www.openslr.org/94

## On-the-fly feature computation

This recipe currently supports only on-the-fly filter bank (fbank) feature computation, because `lhotse` manifests and features are not precomputed in this recipe. In principle, this means the dataset can be streamed from Hugging Face, but we have not tested this yet. We may add a version that precomputes features to better match existing recipes.
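
To illustrate what "on-the-fly" means here, the following is a minimal sketch in plain NumPy, not the code this recipe uses. The recipe relies on `lhotse` for feature extraction; the names `log_fbank` and `OnTheFlyFbankDataset`, the 16 kHz sampling rate, the 25 ms / 10 ms framing, and the 80 mel bins are all assumptions made for this example. The point is only that features are computed from the raw waveform each time an item is requested, so nothing has to be written to disk beforehand.

```python
import numpy as np

SAMPLE_RATE = 16000  # assumed sampling rate for this sketch


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=np.float64) / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=np.float64) / 2595.0) - 1.0)


def mel_filterbank(num_bins, n_fft, sample_rate):
    """Triangular mel filters with shape (num_bins, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), num_bins + 2)
    hz_points = mel_to_hz(mel_points)
    filters = np.zeros((num_bins, fft_freqs.size))
    for i in range(num_bins):
        left, center, right = hz_points[i : i + 3]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        filters[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return filters


def log_fbank(waveform, sample_rate=SAMPLE_RATE, num_bins=80, frame_ms=25, shift_ms=10):
    """Log mel filter bank energies with shape (num_frames, num_bins)."""
    waveform = np.asarray(waveform, dtype=np.float64)
    frame_len = sample_rate * frame_ms // 1000
    shift = sample_rate * shift_ms // 1000
    if waveform.size < frame_len:
        return np.zeros((0, num_bins))
    n_fft = 1 << (frame_len - 1).bit_length()  # next power of two >= frame_len
    num_frames = 1 + (waveform.size - frame_len) // shift
    # Build a (num_frames, frame_len) index matrix to slice overlapping frames.
    idx = np.arange(frame_len)[None, :] + shift * np.arange(num_frames)[:, None]
    frames = waveform[idx] * np.hanning(frame_len)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    mel_energies = power @ mel_filterbank(num_bins, n_fft, sample_rate).T
    return np.log(np.maximum(mel_energies, 1e-10))  # floor avoids log(0)


class OnTheFlyFbankDataset:
    """Hypothetical dataset: features are computed when an item is requested."""

    def __init__(self, utterances, num_bins=80):
        self.utterances = utterances  # list of (waveform, transcript) pairs
        self.num_bins = num_bins

    def __len__(self):
        return len(self.utterances)

    def __getitem__(self, index):
        waveform, text = self.utterances[index]
        return {"features": log_fbank(waveform, num_bins=self.num_bins), "text": text}


# Synthetic one-second 440 Hz tone standing in for a real utterance.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
dataset = OnTheFlyFbankDataset([(np.sin(2 * np.pi * 440.0 * t), "placeholder transcript")])
item = dataset[0]
print(item["features"].shape)
```

The trade-off is that feature extraction is repeated every epoch, which costs CPU time during training but avoids a separate feature-preparation stage and the disk space it needs. In `lhotse`, the corresponding mechanism is the `OnTheFlyFeatures` input strategy, which wraps a feature extractor in a similar way.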

<!-- [./RESULTS.md](./RESULTS.md) contains the latest results. -->