1239 Commits

Author SHA1 Message Date
Bailey Hirota
4b634602d6 update musan path 2025-07-10 13:23:30 +09:00
Bailey Hirota
6ef0fec6e8 resolve typos and import issues 2025-07-09 14:06:29 +09:00
Bailey Hirota
8b25152edf remove comment 2025-07-04 15:40:14 +09:00
Bailey Hirota
f6bae95ebd commenting 2025-07-01 21:21:25 +09:00
Bailey Hirota
55d0664339 typos 2025-07-01 21:04:18 +09:00
Bailey Hirota
d8cb41f4f6 changes to asr_datamodule for musan support 2025-07-01 18:18:25 +09:00
Bailey Hirota
252e5eb2e1 remove unused local scripts 2025-06-13 00:49:40 +09:00
Bailey Hirota
fe9f975ec2 changes to train script - no need for limiting utterance length here 2025-06-13 00:48:37 +09:00
Bailey Hirota
e1f140a50e remove commented out codels 2025-06-13 00:33:47 +09:00
Bailey Hirota
78d4e50d0f add stage 6 - update cutset paths to prepare 2025-06-12 00:21:52 +09:00
Bailey Hirota
da75835639 update manifest dir path 2025-06-12 00:20:41 +09:00
Bailey Hirota
5a120cbcb3 add step 4: display manifest stats to mls_eng 2025-06-11 18:06:29 +09:00
Kinan Martin
003e94fac2 Update README.md to reflect MLS English dataset 2025-06-11 09:19:07 +09:00
Kinan Martin
c7c74b8658 Add failsafe for MLS English dev set key alternate name as validation 2025-06-11 09:18:28 +09:00
Kinan Martin
c8d932b0c2 Parametrize dev and test split sizes. 2025-06-10 10:11:33 +09:00
Kinan Martin
a6f60de9dd add utility file for creating subsets of mls english. must be fixed to make dev and test splits have matching sizes to reazonspeech 2025-06-06 11:44:27 +09:00
Kinan Martin
052fcc3218 add utility file for updating the storage_path of cutsets for use in the multilingual training recipe directory structure 2025-06-06 11:42:08 +09:00
Kinan Martin
6255ba5cb2 fix decode script data module usage 2025-06-06 11:29:29 +09:00
Kinan Martin
ce894a7ba2 Combined updates. Changed BBPE path structure, changed dataset path structure, added script to update cutset paths. WIP 2025-06-04 10:12:39 +09:00
Kinan Martin
1f11ba4d28 use huggingface_hub library to download mls_english 2025-05-22 09:15:12 +09:00
Kinan Martin
f3f04fa626 switch mls_english clone from https to ssh 2025-05-21 10:25:47 +09:00
Kinan Martin
e6615df4eb fix stage 5 output pathing 2025-05-15 09:11:40 +09:00
Kinan Martin
daff070d68 restore version of mls_english compute_fbank_mls_english.py and prepare.sh from commit 547f5c5 2025-05-15 07:24:26 +09:00
Kinan Martin
e34f2dbb2a merge change to remove bilingual param with new multidataset_datamodule 2025-05-14 08:51:11 +09:00
Kinan Martin
eb5004880f deprecate params.bilingual=0, replace ReazonSpeechAsrDataModule for MultiDatasetAsrDataModule, not tested yet 2025-05-14 08:41:03 +09:00
Bailey Hirota
7ef1811063 remove bilingual tag from train.py 2025-05-14 08:37:44 +09:00
Bailey Hirota
b2df5bbb83 Revert "add fbank". This reverts commit ba603e0a0a514056ec6d32677053c41743a1a5dd. 2025-05-13 09:43:17 +09:00
Bailey Hirota
82bd37cacd add fbank 2025-05-13 09:43:05 +09:00
Kinan Martin
21d1bf73bb new version of multi_ja_en prepare.sh script which swaps Librispeech for MLS English 2025-05-09 10:57:41 +09:00
Kinan Martin
547f5c5cfb optimize with num_jobs on save_audios 2025-05-02 07:22:38 +09:00
Kinan Martin
88249f0eb4 fix stage 2 and 3 2025-05-01 08:15:07 +09:00
Kinan Martin
90326c1f43 fix validation manifest name 2025-05-01 08:05:42 +09:00
Kinan Martin
dbe270ba94 adjusted prepare.sh to only calculate fbank and manifest together; adjust datamodule to load from manifest files 2025-04-30 10:06:13 +09:00
Kinan Martin
cf425173af move compute_fbank_mls_english.py, add validate_manifest.py, add shared symlink to librispeech 2025-04-24 09:39:54 +09:00
Kinan Martin
4f743993ef instead of on-the-fly features, precompute fbank and manifests in prepare.sh 2025-04-23 10:13:15 +09:00
Kinan Martin
4e2a4fdcd8 readme 2025-04-16 08:13:59 +09:00
Kinan Martin
bb6d672b54 pre-commit hooks 2025-04-16 08:05:05 +09:00
Kinan Martin
e69e1c04b2 separate transcript prep stage from bpe train stage 2025-04-16 07:15:25 +09:00
Kinan Martin
6e81d9aa5b symlink copied files to librispeech recipe dir 2025-04-16 07:11:25 +09:00
Kinan Martin
0e868049a6 Merge branch 'k2-fsa:master' into mls_english_clean 2025-04-15 17:52:18 -04:00
Kinan Martin
cf8e9a8a1c cleaned-up version of recipe 2025-04-15 10:19:51 +09:00
Kinan Martin
a4be3cb3db replace file 2025-04-14 08:27:50 +09:00
Kinan Martin
1e9bb87305 change default path 2025-04-11 10:30:08 +09:00
Kinan Martin
3eeadd0f3a update prepare.sh, fix asr_datamodule.py 2025-04-11 10:29:27 +09:00
math345
64c5364085 Fix bug: When resuming training from a checkpoint, model_avg was not assigned, resulting in a None error. (#1914) 2025-04-10 11:37:28 +08:00
Fangjun Kuang
300a821f58 Fix aishell training (#1916) 2025-04-10 10:30:37 +08:00
Fangjun Kuang
171cf8c9fe Avoid redundant computation in PiecewiseLinear. (#1915) 2025-04-09 11:52:37 +08:00
Kinan Martin
93766fc24f WIP v0 MLS English recipe 2025-04-09 10:22:20 +09:00
Wei Kang
86bd16d496 [KWS] Remove graph compiler (#1905) 2025-04-02 22:10:06 +08:00
Fangjun Kuang
db9fb8ad31 Add scripts to export streaming zipformer(v1) to RKNN (#1882) 2025-02-27 17:10:58 +08:00