Bailey Machiko Hirota | 9d93d63cf2 | Update RESULTS.md | 2025-07-28 17:52:36 +09:00
Bailey Hirota | dc4db379ea | PR review suggestions implemented | 2025-07-28 17:52:36 +09:00
Bailey Hirota | 6012edbc17 | black and isort formatting | 2025-07-28 17:52:36 +09:00
Bailey Machiko Hirota | 154ef43206 | Update egs/multi_ja_en/ASR/local/utils/update_cutset_paths.py (Co-authored-by: Yubo <54519381+yuta0306@users.noreply.github.com>) | 2025-07-28 17:52:36 +09:00
Bailey Machiko Hirota | f7fec4a6e7 | Update egs/multi_ja_en/ASR/local/utils/update_cutset_paths.py (Co-authored-by: Yubo <54519381+yuta0306@users.noreply.github.com>) | 2025-07-28 17:52:36 +09:00
Bailey Machiko Hirota | 542620c4e3 | Update egs/multi_ja_en/ASR/local/utils/update_cutset_paths.py (Co-authored-by: Yubo <54519381+yuta0306@users.noreply.github.com>) | 2025-07-28 17:52:36 +09:00
Bailey Machiko Hirota | 310aaec3cc | Update egs/multi_ja_en/ASR/local/utils/update_cutset_paths.py (Co-authored-by: Yubo <54519381+yuta0306@users.noreply.github.com>) | 2025-07-28 17:52:36 +09:00
Bailey Hirota | aee7b87adb | working changes for musan mixing | 2025-07-28 17:52:36 +09:00
Bailey Hirota | d5cc0301d4 | attempt to fix musan paths | 2025-07-28 17:52:36 +09:00
Bailey Hirota | 0f700ed0b2 | update musan symlinks | 2025-07-28 17:52:36 +09:00
Bailey Hirota | 093a035935 | update musan paths | 2025-07-28 17:52:36 +09:00
Bailey Hirota | 4e92879751 | update musan path | 2025-07-28 17:52:36 +09:00
Bailey Hirota | f51621b374 | resolve typos and import issues | 2025-07-28 17:52:36 +09:00
Bailey Hirota | de35cc2760 | remove comment | 2025-07-28 17:52:36 +09:00
Bailey Hirota | 5ec9389909 | commenting | 2025-07-28 17:52:36 +09:00
Bailey Hirota | df923f3a16 | typos | 2025-07-28 17:52:36 +09:00
Bailey Hirota | 70a7940c95 | changes to asr_datamodule for musan support | 2025-07-28 17:52:36 +09:00
Kinan Martin | 5f2f6843c9 | make prepare.sh symlinks relative | 2025-07-28 17:52:36 +09:00
Bailey Hirota | 19b62c008d | remove unused local scripts | 2025-07-28 17:52:36 +09:00
Bailey Hirota | f6ad423398 | changes to train script - no need for limiting utterance length here | 2025-07-28 17:52:36 +09:00
Bailey Hirota | ddc2daaccd | remove commented out codels | 2025-07-28 17:52:36 +09:00
Bailey Hirota | f3e59dfa4c | add stage 6 - update cutset paths to prepare | 2025-07-28 17:52:36 +09:00
Bailey Hirota | cdf246ca1c | update manifest dir path | 2025-07-28 17:52:36 +09:00
Bailey Hirota | c77a8470f5 | add step 4: display manifest stats to mls_eng | 2025-07-28 17:52:36 +09:00
Kinan Martin | fd3fbe6454 | Update README.md to reflect MLS English dataset | 2025-07-28 17:52:36 +09:00
Kinan Martin | 78ee595b45 | Add failsafe for MLS English dev set key alternate name as validation | 2025-07-28 17:52:36 +09:00
Kinan Martin | ad1be22919 | Parametrize dev and test split sizes. | 2025-07-28 17:52:36 +09:00
Kinan Martin | b167ac7b40 | add utility file for creating subsets of mls english. must be fixed to make dev and test splits have matching sizes to reazonspeech | 2025-07-28 17:52:36 +09:00
Kinan Martin | eafbd6429b | add utility file for updating the storage_path of cutsets for use in the multilingual training recipe directory structure | 2025-07-28 17:52:36 +09:00
Kinan Martin | 2f1c61124a | fix decode script data module usage | 2025-07-28 17:52:36 +09:00
Kinan Martin | 3307836352 | Combined updates. Changed BBPE path structure, changed dataset path structure, added script to update cutset paths. WIP | 2025-07-28 17:52:36 +09:00
Kinan Martin | a8ecb16d47 | use huggingface_hub library to download mls_english | 2025-07-28 17:52:36 +09:00
Kinan Martin | f4b29870a0 | switch mls_english clone from https to ssh | 2025-07-28 17:52:36 +09:00
Kinan Martin | 782e1fb958 | fix stage 5 output pathing | 2025-07-28 17:52:36 +09:00
Kinan Martin | 5417e0926b | restore version of mls_english compute_fbank_mls_english.py and prepare.sh from commit 547f5c5 | 2025-07-28 17:52:36 +09:00
Bailey Hirota | 6d71d9cff4 | remove bilingual tag from train.py | 2025-07-28 17:52:28 +09:00
Kinan Martin | 3751441dad | deprecate params.bilingual=0, replace ReazonSpeechAsrDataModule for MultiDatasetAsrDataModule, not tested yet | 2025-07-28 17:49:35 +09:00
Bailey Hirota | 61e81bfc26 | Revert "add fbank" (This reverts commit ba603e0a0a514056ec6d32677053c41743a1a5dd.) | 2025-07-28 17:49:35 +09:00
Bailey Hirota | c83b115b49 | add fbank | 2025-07-28 17:49:35 +09:00
Kinan Martin | abebb6aaf0 | new version of multi_ja_en prepare.sh script which swaps Librispeech for MLS English | 2025-07-28 17:49:35 +09:00
Kinan Martin | fa84782b21 | optimize with num_jobs on save_audios | 2025-07-28 17:49:35 +09:00
Kinan Martin | f2e01712de | fix stage 2 and 3 | 2025-07-28 17:49:35 +09:00
Kinan Martin | 59519a41fa | fix validation manifest name | 2025-07-28 17:49:35 +09:00
Kinan Martin | 4ca8ee94f0 | adjusted prepare.sh to only calculate fbank and manifest together; adjust datamodule to load from manifest files | 2025-07-28 17:49:35 +09:00
Kinan Martin | d6e3c98e58 | move compute_fbank_mls_english.py, add validate_manifest.py, add shared symlink to librispeech | 2025-07-28 17:49:35 +09:00
Kinan Martin | 68e3ceaaac | instead of on-the-fly features, precompute fbank and manifests in prepare.sh | 2025-07-28 17:49:35 +09:00
Kinan Martin | ce44150e25 | readme | 2025-07-28 17:49:35 +09:00
Kinan Martin | a34d34a38e | pre-commit hooks | 2025-07-28 17:49:35 +09:00
Kinan Martin | 898525962c | separate transcript prep stage from bpe train stage | 2025-07-28 17:49:35 +09:00
Kinan Martin | 8c1c7100d3 | symlink copied files to librispeech recipe dir | 2025-07-28 17:49:35 +09:00