From 556a3f094125a529488714cd6671754919faeab5 Mon Sep 17 00:00:00 2001
From: Bailey Machiko Hirota <53164945+baileyeet@users.noreply.github.com>
Date: Thu, 14 Aug 2025 17:02:44 +0900
Subject: [PATCH] Update README.md

---
 egs/mls_english/ASR/README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/egs/mls_english/ASR/README.md b/egs/mls_english/ASR/README.md
index bacc237db..cb8f51f46 100644
--- a/egs/mls_english/ASR/README.md
+++ b/egs/mls_english/ASR/README.md
@@ -5,7 +5,6 @@
 **Multilingual LibriSpeech (MLS)** is a large multilingual corpus suitable for speech research. The dataset is derived from read audiobooks from LibriVox and consists of 8 languages - English, German, Dutch, Spanish, French, Italian, Portuguese, Polish. It includes about 44.5K hours of English and a total of about 6K hours for other languages. This icefall training recipe was created for the restructured version of the English split of the dataset available on Hugging Face below.
 
 
-
 The dataset is available on Hugging Face. For more details, please visit:
 
 - Dataset: https://huggingface.co/datasets/parler-tts/mls_eng
@@ -14,6 +13,7 @@ The dataset is available on Hugging Face. For more details, please visit:
 
 ## On-the-fly feature computation
 
-This recipe currently only supports on-the-fly feature bank computation, since `lhotse` manifests and feature banks are not pre-calculated in this recipe. This should mean that the dataset can be streamed from Hugging Face, but we have not tested this yet. We may add a version that supports pre-calculating features to better match existing recipes.
+This recipe currently only supports on-the-fly feature bank computation, since `lhotse` manifests and feature banks are not pre-calculated in this recipe. This should mean that the dataset can be streamed from Hugging Face, but we have not tested this yet. We may add a version that supports pre-calculating features to better match existing recipes.\
+
 
-
+[./RESULTS.md](./RESULTS.md) contains the latest results. This MLS English recipe was primarily developed for use in the ```multi_ja_en``` Japanese-English bilingual pipeline, which is based on MLS English and ReazonSpeech.