Bailey Machiko Hirota
|
2f1f419149
|
Update egs/multi_ja_en/ASR/local/utils/update_cutset_paths.py
Co-authored-by: Yubo <54519381+yuta0306@users.noreply.github.com>
|
2025-08-05 19:09:12 +09:00 |
|
Bailey Machiko Hirota
|
b19929c302
|
Update egs/multi_ja_en/ASR/local/utils/update_cutset_paths.py
Co-authored-by: Yubo <54519381+yuta0306@users.noreply.github.com>
|
2025-08-05 19:07:58 +09:00 |
|
Bailey Machiko Hirota
|
865b859e5d
|
Update egs/multi_ja_en/ASR/local/utils/update_cutset_paths.py
Co-authored-by: Yubo <54519381+yuta0306@users.noreply.github.com>
|
2025-08-05 19:06:45 +09:00 |
|
Bailey Machiko Hirota
|
95f58e69fd
|
Update egs/multi_ja_en/ASR/local/utils/update_cutset_paths.py
Co-authored-by: Yubo <54519381+yuta0306@users.noreply.github.com>
|
2025-08-05 19:05:35 +09:00 |
|
Bailey Hirota
|
60f326bf63
|
working changes for musan mixing
|
2025-08-05 19:04:24 +09:00 |
|
Bailey Hirota
|
a310d8fd5b
|
attempt to fix musan paths
|
2025-08-05 19:03:13 +09:00 |
|
Bailey Hirota
|
44758153df
|
update musan symlinks
|
2025-08-05 19:02:04 +09:00 |
|
Bailey Hirota
|
aeffb15dab
|
update musan paths
|
2025-08-05 19:00:55 +09:00 |
|
Bailey Hirota
|
6272827db3
|
update musan path
|
2025-08-05 18:59:44 +09:00 |
|
Bailey Hirota
|
c610c6d56a
|
resolve typos and import issues
|
2025-08-05 18:58:34 +09:00 |
|
Bailey Hirota
|
1cf544b513
|
remove comment
|
2025-08-05 18:57:21 +09:00 |
|
Bailey Hirota
|
5fb4bdf9e7
|
commenting
|
2025-08-05 18:56:05 +09:00 |
|
Bailey Hirota
|
ed2c0a4597
|
typos
|
2025-08-05 18:54:52 +09:00 |
|
Bailey Hirota
|
199650781f
|
changes to asr_datamodule for musan support
|
2025-08-05 18:53:37 +09:00 |
|
Fangjun Kuang
|
d7ee48e879
|
Validate generated manifest files. (#338)
|
2025-08-05 18:52:31 +09:00 |
|
Kinan Martin
|
694ecb907a
|
make prepare.sh symlinks relative
|
2025-08-05 18:51:16 +09:00 |
|
Bailey Hirota
|
9c91775a51
|
remove unused local scripts
|
2025-08-05 18:49:58 +09:00 |
|
Bailey Hirota
|
ac94174215
|
changes to train script - no need for limiting utterance length here
|
2025-08-05 18:48:48 +09:00 |
|
Bailey Hirota
|
76bae70132
|
remove commented out code
|
2025-08-05 18:47:34 +09:00 |
|
Bailey Hirota
|
606789b8f4
|
add stage 6 - update cutset paths to prepare
|
2025-08-05 18:46:18 +09:00 |
|
Bailey Hirota
|
1ddd3cdcf8
|
update manifest dir path
|
2025-08-05 18:45:09 +09:00 |
|
Bailey Hirota
|
0a4ed5e636
|
add step 4: display manifest stats to mls_eng
|
2025-08-05 18:43:56 +09:00 |
|
Kinan Martin
|
065ca315c8
|
Update README.md to reflect MLS English dataset
|
2025-08-05 18:42:41 +09:00 |
|
Kinan Martin
|
9c318da803
|
Add failsafe for MLS English dev set key alternate name as validation
|
2025-08-05 18:41:22 +09:00 |
|
Kinan Martin
|
b6d43a40ac
|
Parametrize dev and test split sizes.
|
2025-08-05 18:40:08 +09:00 |
|
Kinan Martin
|
d136086d6b
|
add utility file for creating subsets of mls english. must be fixed to make dev and test splits have matching sizes to reazonspeech
|
2025-08-05 18:38:55 +09:00 |
|
Kinan Martin
|
b25254f0c9
|
add utility file for updating the storage_path of cutsets for use in the multilingual training recipe directory structure
|
2025-08-05 18:37:41 +09:00 |
|
Kinan Martin
|
68bff93940
|
fix decode script data module usage
|
2025-08-05 18:36:27 +09:00 |
|
Kinan Martin
|
1b1a317603
|
Combined updates. Changed BBPE path structure, changed dataset path structure, added script to update cutset paths. WIP
|
2025-08-05 18:35:20 +09:00 |
|
Kinan Martin
|
1093e78612
|
use huggingface_hub library to download mls_english
|
2025-08-05 18:34:15 +09:00 |
|
Kinan Martin
|
5682978c64
|
switch mls_english clone from https to ssh
|
2025-08-05 18:33:03 +09:00 |
|
Kinan Martin
|
2265e1afed
|
fix stage 5 output pathing
|
2025-08-05 18:31:48 +09:00 |
|
Kinan Martin
|
7bea23e954
|
restore version of mls_english compute_fbank_mls_english.py and prepare.sh from commit 547f5c5
|
2025-08-05 18:30:41 +09:00 |
|
Bailey Hirota
|
8b035a0c96
|
remove bilingual tag from train.py
|
2025-08-05 18:29:29 +09:00 |
|
Kinan Martin
|
99db0e4643
|
deprecate params.bilingual=0, replace ReazonSpeechAsrDataModule with MultiDatasetAsrDataModule, not tested yet
|
2025-08-05 18:16:17 +09:00 |
|
Bailey Hirota
|
31a37c7e44
|
Revert "add fbank"
This reverts commit ba603e0a0a514056ec6d32677053c41743a1a5dd.
|
2025-08-05 18:15:04 +09:00 |
|
Bailey Hirota
|
7d462aa8b4
|
add fbank
|
2025-08-05 18:13:51 +09:00 |
|
Kinan Martin
|
06e429131b
|
new version of multi_ja_en prepare.sh script which swaps Librispeech for MLS English
|
2025-08-05 18:12:40 +09:00 |
|
Kinan Martin
|
0e86ef805c
|
optimize with num_jobs on save_audios
|
2025-08-05 18:11:22 +09:00 |
|
Kinan Martin
|
73dea24fd9
|
fix stage 2 and 3
|
2025-08-05 18:10:15 +09:00 |
|
Kinan Martin
|
2504b23861
|
fix validation manifest name
|
2025-08-05 18:09:00 +09:00 |
|
Kinan Martin
|
eb2168bc49
|
adjusted prepare.sh to only calculate fbank and manifest together; adjust datamodule to load from manifest files
|
2025-08-05 18:07:45 +09:00 |
|
Kinan Martin
|
a8f45bc08b
|
move compute_fbank_mls_english.py, add validate_manifest.py, add shared symlink to librispeech
|
2025-08-05 18:06:28 +09:00 |
|
Kinan Martin
|
fe88d1db36
|
instead of on-the-fly features, precompute fbank and manifests in prepare.sh
|
2025-08-05 18:05:18 +09:00 |
|
Kinan Martin
|
996334f520
|
readme
|
2025-08-05 18:04:04 +09:00 |
|
Kinan Martin
|
24db8c11ba
|
pre-commit hooks
|
2025-08-05 18:02:50 +09:00 |
|
Kinan Martin
|
c532a503e7
|
separate transcript prep stage from bpe train stage
|
2025-08-05 18:01:41 +09:00 |
|
Kinan Martin
|
313afea773
|
symlink copied files to librispeech recipe dir
|
2025-08-05 18:00:24 +09:00 |
|
Kinan Martin
|
e76b749450
|
cleaned-up version of recipe
|
2025-08-05 17:59:15 +09:00 |
|
Kinan Martin
|
1b8a3061b0
|
replace file
|
2025-08-05 17:57:59 +09:00 |
|