Commit Graph

  • a4c1db5a49 reformat Bailey Hirota 2025-09-02 10:36:51 +09:00
  • 7231cf44aa Remove changes to files outside of relevant recipes Kinan Martin 2025-08-29 16:52:13 +09:00
  • 915e8e399c Add CHiME-4 dataset, RIR and Self-Distillation jaeeunbaik 2025-08-27 16:11:20 +09:00
  • 36fc1f1d1e Merge pull request #4 from reazon-research/musan-mls-clean-final Kinan Martin 2025-08-22 16:28:34 +09:00
  • 572eef2bd5 Add generate_tokens function to train_bpe_model.py. Kinan Martin 2025-08-15 16:01:45 +09:00
  • 556a3f0941 Update README.md Bailey Machiko Hirota 2025-08-14 17:02:44 +09:00
  • 8e186160d1 Update RESULTS.md Bailey Machiko Hirota 2025-08-14 16:05:38 +09:00
  • 8c08c9c902 Create RESULTS.md Bailey Machiko Hirota 2025-08-14 11:53:16 +09:00
  • 5400f4315d training and decoding compatibility changes Bailey Hirota 2025-08-11 15:37:49 +09:00
  • e7d72a2e5d Add MVQ multi-teacher, multi-layer knowledge distillation scheme: add distillation scripts and code files dddmmys 2025-08-06 14:11:34 +08:00
  • 130c2a59c3 Merge branch 'multi_ja_en_mls_english_clean' into musan-mls-clean-final Bailey Machiko Hirota 2025-08-06 11:45:20 +09:00
  • 4e05d70f45 fix stash commit Bailey Hirota 2025-08-06 11:08:19 +09:00
  • f9ceead59e Validate generated manifest files. (#338) Fangjun Kuang 2022-05-03 07:08:33 +08:00
  • dee07dec3a Validate generated manifest files. (#338) Fangjun Kuang 2022-05-03 07:08:33 +08:00
  • ee2a6d60e0 remove bilingual tag from train.py Bailey Hirota 2025-05-14 08:37:44 +09:00
  • f210002500 Validate generated manifest files. (#338) Fangjun Kuang 2022-05-03 07:08:33 +08:00
  • 0967f5f7d1 Manually fix merge conflict in multi_ja_en/ASR/zipformer/train.py Kinan Martin 2025-07-28 17:59:47 +09:00
  • 636121c507 remove bilingual tag from train.py Bailey Hirota 2025-05-14 08:37:44 +09:00
  • ed79fa3c04 revert unrelated transformer.py diffs from rebase Bailey Hirota 2025-08-05 21:44:26 +09:00
  • c23af2ea1a musan implementation for mls_english Bailey Hirota 2025-08-05 17:15:37 +09:00
  • f15a783896 Validate generated manifest files. (#338) Fangjun Kuang 2022-05-03 07:08:33 +08:00
  • 2d8e3fd858 Fix transformer decoder layer (#1995) Fangjun Kuang 2025-07-18 20:12:29 +08:00
  • 11df2a83fc Musan implementation for ReazonSpeech (#1988) Bailey Machiko Hirota 2025-07-18 18:16:19 +09:00
  • 0ca7595d25 Update RESULTS.md Bailey Machiko Hirota 2025-07-18 18:04:12 +09:00
  • 94cf8c3afb support left pad for make_pad_mask (#1990) Yifan Yang 2025-07-16 23:59:04 +08:00
  • 8dd2c0f21b PR review suggestions implemented Bailey Hirota 2025-07-17 02:01:03 +09:00
  • 7b4abbaaac black and isort formatting Bailey Hirota 2025-07-16 19:53:47 +09:00
  • 2f1f419149 Update egs/multi_ja_en/ASR/local/utils/update_cutset_paths.py Bailey Machiko Hirota 2025-07-16 17:18:34 +09:00
  • b19929c302 Update egs/multi_ja_en/ASR/local/utils/update_cutset_paths.py Bailey Machiko Hirota 2025-07-16 17:18:12 +09:00
  • 865b859e5d Update egs/multi_ja_en/ASR/local/utils/update_cutset_paths.py Bailey Machiko Hirota 2025-07-16 17:18:03 +09:00
  • 95f58e69fd Update egs/multi_ja_en/ASR/local/utils/update_cutset_paths.py Bailey Machiko Hirota 2025-07-16 17:17:46 +09:00
  • 60f326bf63 working changes for musan mixing Bailey Hirota 2025-07-15 13:47:59 +09:00
  • a310d8fd5b attempt to fix musan paths Bailey Hirota 2025-07-14 18:33:23 +09:00
  • 44758153df update musan symlinks Bailey Hirota 2025-07-11 11:00:09 +09:00
  • aeffb15dab update musan paths Bailey Hirota 2025-07-10 15:32:03 +09:00
  • 6272827db3 update musan path Bailey Hirota 2025-07-10 13:23:30 +09:00
  • c610c6d56a resolve typos and import issues Bailey Hirota 2025-07-09 14:06:29 +09:00
  • 1cf544b513 remove comment Bailey Hirota 2025-07-04 15:40:14 +09:00
  • 5fb4bdf9e7 commenting Bailey Hirota 2025-07-01 21:21:25 +09:00
  • ed2c0a4597 typos Bailey Hirota 2025-07-01 21:04:18 +09:00
  • 199650781f changes to asr_datamodule for musan support Bailey Hirota 2025-07-01 18:18:25 +09:00
  • d7ee48e879 Validate generated manifest files. (#338) Fangjun Kuang 2022-05-03 07:08:33 +08:00
  • 694ecb907a make prepare.sh symlinks relative Kinan Martin 2025-07-08 11:16:18 +09:00
  • 9c91775a51 remove unused local scripts Bailey Hirota 2025-06-13 00:49:40 +09:00
  • ac94174215 changes to train script - no need for limiting utterance length here Bailey Hirota 2025-06-13 00:48:37 +09:00
  • 76bae70132 remove commented out codels Bailey Hirota 2025-06-13 00:33:47 +09:00
  • 606789b8f4 add stage 6 - update cutset paths to prepare Bailey Hirota 2025-06-12 00:21:52 +09:00
  • 1ddd3cdcf8 update manifest dir path Bailey Hirota 2025-06-12 00:20:41 +09:00
  • 0a4ed5e636 add step 4: display manifest stats to mls_eng Bailey Hirota 2025-06-11 18:06:08 +09:00
  • 065ca315c8 Update README.md to reflect MLS English dataset Kinan Martin 2025-06-11 09:19:07 +09:00
  • 9c318da803 Add failsafe for MLS English dev set key alternate name as validation Kinan Martin 2025-06-11 09:18:28 +09:00
  • b6d43a40ac Parametrize dev and test split sizes. Kinan Martin 2025-06-10 10:11:33 +09:00
  • d136086d6b add utility file for creating subsets of mls english. must be fixed to make dev and test splits have matching sizes to reazonspeech Kinan Martin 2025-06-06 11:44:27 +09:00
  • b25254f0c9 add utility file for updating the storage_path of cutsets for use in the multilingual training recipe directory structure Kinan Martin 2025-06-06 11:42:08 +09:00
  • 68bff93940 fix decode script data module usage Kinan Martin 2025-06-06 11:29:29 +09:00
  • 1b1a317603 Combined updates. Changed BBPE path structure, changed dataset path structure, added script to update cutset paths. WIP Kinan Martin 2025-06-04 10:12:39 +09:00
  • 1093e78612 use huggingface_hub library to download mls_english Kinan Martin 2025-05-22 09:15:12 +09:00
  • 5682978c64 switch mls_english clone from https to ssh Kinan Martin 2025-05-21 10:25:47 +09:00
  • 2265e1afed fix stage 5 output pathing Kinan Martin 2025-05-15 09:11:40 +09:00
  • 7bea23e954 restore version of mls_english compute_fbank_mls_english.py and prepare.sh from commit 547f5c5 Kinan Martin 2025-05-15 07:24:26 +09:00
  • 8b035a0c96 remove bilingual tag from train.py Bailey Hirota 2025-05-14 08:37:44 +09:00
  • 99db0e4643 deprecate params.bilingual=0, replace ReazonSpeechAsrDataModule for MultiDatasetAsrDataModule, not tested yet Kinan Martin 2025-05-14 08:40:15 +09:00
  • 31a37c7e44 Revert "add fbank" Bailey Hirota 2025-05-02 23:18:53 +09:00
  • 7d462aa8b4 add fbank Bailey Hirota 2025-05-02 03:31:55 +09:00
  • 06e429131b new version of multi_ja_en prepare.sh script which swaps Librispeech for MLS English Kinan Martin 2025-05-09 10:57:41 +09:00
  • 0e86ef805c optimize with num_jobs on save_audios Kinan Martin 2025-05-02 07:22:38 +09:00
  • 73dea24fd9 fix stage 2 and 3 Kinan Martin 2025-05-01 08:15:07 +09:00
  • 2504b23861 fix validation manifest name Kinan Martin 2025-05-01 08:05:42 +09:00
  • eb2168bc49 adjusted prepare.sh to only calculate fbank and manifest together; adjust datamodule to load from manifest files Kinan Martin 2025-04-30 10:06:13 +09:00
  • a8f45bc08b move compute_fbank_mls_english.py, add validate_manifest.py, add shared symlink to librispeech Kinan Martin 2025-04-24 09:39:54 +09:00
  • fe88d1db36 instead of on-the-fly features, precompute fbank and manifests in prepare.sh Kinan Martin 2025-04-23 10:13:15 +09:00
  • 996334f520 readme Kinan Martin 2025-04-16 08:13:59 +09:00
  • 24db8c11ba pre-commit hooks Kinan Martin 2025-04-16 08:05:05 +09:00
  • c532a503e7 separate transcript prep stage from bpe train stage Kinan Martin 2025-04-16 07:15:25 +09:00
  • 313afea773 symlink copied files to librispeech recipe dir Kinan Martin 2025-04-16 07:10:39 +09:00
  • e76b749450 cleaned-up version of recipe Kinan Martin 2025-04-15 10:19:51 +09:00
  • 1b8a3061b0 replace file Kinan Martin 2025-04-14 08:27:50 +09:00
  • 0ab027411f change default path Kinan Martin 2025-04-11 10:30:08 +09:00
  • ba6d8e8b26 update prepare.sh, fix asr_datamodule.py Kinan Martin 2025-04-11 10:29:27 +09:00
  • c92c606c5f WIP v0 MLS English recipe Kinan Martin 2025-04-09 10:22:20 +09:00
  • 1c5d792bf1 Validate generated manifest files. (#338) Fangjun Kuang 2022-05-03 07:08:33 +08:00
  • aeba8b505c Validate generated manifest files. (#338) Fangjun Kuang 2022-05-03 07:08:33 +08:00
  • c053b7c8f0 adding farsi cv preprocessing Mohammad Gholizadeh 2025-08-02 09:27:18 +01:00
  • dbd89773d5 Manually fix merge conflict in multi_ja_en/ASR/zipformer/train.py Kinan Martin 2025-07-28 17:59:47 +09:00
  • aed139f125 Musan implementation for ReazonSpeech (#1988) Bailey Machiko Hirota 2025-07-18 18:16:19 +09:00
  • 9d93d63cf2 Update RESULTS.md Bailey Machiko Hirota 2025-07-18 18:04:12 +09:00
  • dc4db379ea PR review suggestions implemented Bailey Hirota 2025-07-17 02:01:03 +09:00
  • 6012edbc17 black and isort formatting Bailey Hirota 2025-07-16 19:53:47 +09:00
  • 154ef43206 Update egs/multi_ja_en/ASR/local/utils/update_cutset_paths.py Bailey Machiko Hirota 2025-07-16 17:18:34 +09:00
  • f7fec4a6e7 Update egs/multi_ja_en/ASR/local/utils/update_cutset_paths.py Bailey Machiko Hirota 2025-07-16 17:18:12 +09:00
  • 542620c4e3 Update egs/multi_ja_en/ASR/local/utils/update_cutset_paths.py Bailey Machiko Hirota 2025-07-16 17:18:03 +09:00
  • 310aaec3cc Update egs/multi_ja_en/ASR/local/utils/update_cutset_paths.py Bailey Machiko Hirota 2025-07-16 17:17:46 +09:00
  • aee7b87adb working changes for musan mixing Bailey Hirota 2025-07-15 13:47:59 +09:00
  • d5cc0301d4 attempt to fix musan paths Bailey Hirota 2025-07-14 18:33:23 +09:00
  • 0f700ed0b2 update musan symlinks Bailey Hirota 2025-07-11 11:00:09 +09:00
  • 093a035935 update musan paths Bailey Hirota 2025-07-10 15:32:03 +09:00
  • 4e92879751 update musan path Bailey Hirota 2025-07-10 13:23:30 +09:00
  • f51621b374 resolve typos and import issues Bailey Hirota 2025-07-09 14:06:29 +09:00
  • de35cc2760 remove comment Bailey Hirota 2025-07-04 15:40:14 +09:00
  • 5ec9389909 commenting Bailey Hirota 2025-07-01 21:21:25 +09:00