60 Commits

Author SHA1 Message Date
Bailey Machiko Hirota
8c846399a5 Update egs/mls_english/ASR/zipformer/streaming_decode.py
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-09-11 15:32:18 +09:00
Kinan Martin
ef7664e7cf Update egs/mls_english/ASR/local/utils/asr_datamodule.py
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-09-02 18:07:06 +09:00
Bailey Hirota
a4c1db5a49 reformat 2025-09-02 11:42:24 +09:00
Bailey Machiko Hirota
556a3f0941 Update README.md 2025-08-14 17:02:44 +09:00
Bailey Machiko Hirota
8e186160d1 Update RESULTS.md 2025-08-14 16:05:38 +09:00
Bailey Machiko Hirota
8c08c9c902 Create RESULTS.md 2025-08-14 11:53:16 +09:00
Bailey Hirota
5400f4315d training and decoding compatibility changes 2025-08-11 15:37:49 +09:00
Bailey Machiko Hirota
130c2a59c3 Merge branch 'multi_ja_en_mls_english_clean' into musan-mls-clean-final 2025-08-06 11:45:20 +09:00
Bailey Hirota
4e05d70f45 fix stash commit 2025-08-06 11:38:38 +09:00
Bailey Hirota
c23af2ea1a musan implementation for mls_english 2025-08-05 19:18:34 +09:00
Bailey Hirota
76bae70132 remove commented out code 2025-08-05 18:47:34 +09:00
Bailey Hirota
0a4ed5e636 add step 4: display manifest stats to mls_eng 2025-08-05 18:43:56 +09:00
Kinan Martin
9c318da803 Add failsafe for MLS English dev set key alternate name as validation 2025-08-05 18:41:22 +09:00
Kinan Martin
b6d43a40ac Parametrize dev and test split sizes. 2025-08-05 18:40:08 +09:00
Kinan Martin
d136086d6b add utility file for creating subsets of mls english. must be fixed to make dev and test splits have matching sizes to reazonspeech 2025-08-05 18:38:55 +09:00
Kinan Martin
1093e78612 use huggingface_hub library to download mls_english 2025-08-05 18:34:15 +09:00
Kinan Martin
5682978c64 switch mls_english clone from https to ssh 2025-08-05 18:33:03 +09:00
Kinan Martin
7bea23e954 restore version of mls_english compute_fbank_mls_english.py and prepare.sh from commit 547f5c5 2025-08-05 18:30:41 +09:00
Bailey Hirota
31a37c7e44 Revert "add fbank"
This reverts commit ba603e0a0a514056ec6d32677053c41743a1a5dd.
2025-08-05 18:15:04 +09:00
Bailey Hirota
7d462aa8b4 add fbank 2025-08-05 18:13:51 +09:00
Kinan Martin
0e86ef805c optimize with num_jobs on save_audios 2025-08-05 18:11:22 +09:00
Kinan Martin
73dea24fd9 fix stage 2 and 3 2025-08-05 18:10:15 +09:00
Kinan Martin
2504b23861 fix validation manifest name 2025-08-05 18:09:00 +09:00
Kinan Martin
eb2168bc49 adjusted prepare.sh to only calculate fbank and manifest together; adjust datamodule to load from manifest files 2025-08-05 18:07:45 +09:00
Kinan Martin
a8f45bc08b move compute_fbank_mls_english.py, add validate_manifest.py, add shared symlink to librispeech 2025-08-05 18:06:28 +09:00
Kinan Martin
fe88d1db36 instead of on-the-fly features, precompute fbank and manifests in prepare.sh 2025-08-05 18:05:18 +09:00
Kinan Martin
996334f520 readme 2025-08-05 18:04:04 +09:00
Kinan Martin
24db8c11ba pre-commit hooks 2025-08-05 18:02:50 +09:00
Kinan Martin
c532a503e7 separate transcript prep stage from bpe train stage 2025-08-05 18:01:41 +09:00
Kinan Martin
313afea773 symlink copied files to librispeech recipe dir 2025-08-05 18:00:24 +09:00
Kinan Martin
e76b749450 cleaned-up version of recipe 2025-08-05 17:59:15 +09:00
Kinan Martin
1b8a3061b0 replace file 2025-08-05 17:57:59 +09:00
Kinan Martin
0ab027411f change default path 2025-08-05 17:56:51 +09:00
Kinan Martin
ba6d8e8b26 update prepare.sh, fix asr_datamodule.py 2025-08-05 17:55:40 +09:00
Kinan Martin
c92c606c5f WIP v0 MLS English recipe 2025-08-05 17:54:30 +09:00
Bailey Hirota
ddc2daaccd remove commented out code 2025-07-28 17:52:36 +09:00
Bailey Hirota
c77a8470f5 add step 4: display manifest stats to mls_eng 2025-07-28 17:52:36 +09:00
Kinan Martin
78ee595b45 Add failsafe for MLS English dev set key alternate name as validation 2025-07-28 17:52:36 +09:00
Kinan Martin
ad1be22919 Parametrize dev and test split sizes. 2025-07-28 17:52:36 +09:00
Kinan Martin
b167ac7b40 add utility file for creating subsets of mls english. must be fixed to make dev and test splits have matching sizes to reazonspeech 2025-07-28 17:52:36 +09:00
Kinan Martin
a8ecb16d47 use huggingface_hub library to download mls_english 2025-07-28 17:52:36 +09:00
Kinan Martin
f4b29870a0 switch mls_english clone from https to ssh 2025-07-28 17:52:36 +09:00
Kinan Martin
5417e0926b restore version of mls_english compute_fbank_mls_english.py and prepare.sh from commit 547f5c5 2025-07-28 17:52:36 +09:00
Bailey Hirota
61e81bfc26 Revert "add fbank"
This reverts commit ba603e0a0a514056ec6d32677053c41743a1a5dd.
2025-07-28 17:49:35 +09:00
Bailey Hirota
c83b115b49 add fbank 2025-07-28 17:49:35 +09:00
Kinan Martin
fa84782b21 optimize with num_jobs on save_audios 2025-07-28 17:49:35 +09:00
Kinan Martin
f2e01712de fix stage 2 and 3 2025-07-28 17:49:35 +09:00
Kinan Martin
59519a41fa fix validation manifest name 2025-07-28 17:49:35 +09:00
Kinan Martin
4ca8ee94f0 adjusted prepare.sh to only calculate fbank and manifest together; adjust datamodule to load from manifest files 2025-07-28 17:49:35 +09:00
Kinan Martin
d6e3c98e58 move compute_fbank_mls_english.py, add validate_manifest.py, add shared symlink to librispeech 2025-07-28 17:49:35 +09:00