Commit Graph

  • 5ec7297f32 add dataset example for librispeech Fangjun Kuang 2025-05-29 11:44:40 +08:00
  • 717aa53be9 Copy files Fangjun Kuang 2025-05-29 11:06:55 +08:00
  • 7c30dd570b restrict deepspeed >=0.16.9 yfyeung 2025-05-28 03:42:03 +00:00
  • 48026bd62b Decrease num_buckets & remove shuffle_buffer_size pkufool 2025-05-28 10:46:37 +08:00
  • 49256fa917 fix tts stage decode root 2025-05-28 02:34:07 +00:00
  • 5a7c72cb47 add tts task decode root 2025-05-27 02:12:22 -07:00
  • 1281d7a515 add tts training root 2025-05-27 00:18:23 -07:00
  • ffb7d05635 refactor branch exchange in cr-ctc (#1954) Zengwei Yao 2025-05-27 12:09:59 +08:00
  • 4fb970b2ca refactor branch exchange in cr-ctc yaozengwei 2025-05-27 12:04:38 +08:00
  • 39700d5c94 refactor train to reuse code root 2025-05-26 19:53:16 -07:00
  • 11ccaa3ab8 add requirements.txt yfyeung 2025-05-26 04:11:28 +00:00
  • d1a535dc76 Merge branch 'k2-fsa:master' into dev/speechllm Yifan Yang 2025-05-24 13:13:42 +08:00
  • a6aaf33843 Using non strict match in context graph for contextual words pkufool 2025-05-23 17:43:30 +08:00
  • e6e1f3fa4f add tts stage root 2025-05-23 01:53:05 -07:00
  • dd858f0cd1 support instruct s2s root 2025-05-22 23:16:33 -07:00
  • 9fff18edec refactor code root 2025-05-22 19:14:52 -07:00
  • 021e1a8846 Add acknowledgment to README (#1950) Mahsa Yarmohammadi 2025-05-22 10:06:35 -04:00
  • 452a993ab2 Add acknowledgment to README Mahsa Yarmohammadi 2025-05-22 08:51:03 -04:00
  • 7a12d88d6c update root 2025-05-21 22:18:57 -07:00
  • 7aa6c80ddb add multi gpu processing root 2025-05-21 21:54:59 -07:00
  • 30e7ea4b5a Fix a bug in finetune.py --use-mux (#1949) Tianxiang Zhao 2025-05-22 12:05:01 +08:00
  • b05b2604e9 Fix a bug in finetune.py --use-mux Redemption 2025-05-22 11:58:58 +08:00
  • 1f11ba4d28 use huggingface_hub library to download mls_english Kinan Martin 2025-05-22 09:15:12 +09:00
  • fd8f8780fa Fix logging torch.dtype. (#1947) Fangjun Kuang 2025-05-21 12:04:57 +08:00
  • 2ca458bb56 Fix logging torch.dtype. Fangjun Kuang 2025-05-21 12:01:38 +08:00
  • f3f04fa626 switch mls_english clone from https to ssh Kinan Martin 2025-05-21 10:25:47 +09:00
  • ca84aff5d6 remove cosyvoice lib root 2025-05-20 00:52:09 -07:00
  • 9cdd393f43 add server url root 2025-05-20 07:48:49 +00:00
  • 50fc1aba60 add multi-node root 2025-05-18 18:47:22 -07:00
  • 4a29430349 add loss type root 2025-05-19 01:31:21 +00:00
  • e52581e69b support local_rank for multi-node root 2025-05-16 00:02:12 -07:00
  • 0e8c1db4d0 fix speed perturb issue root 2025-05-15 22:45:04 -07:00
  • bfb4ebeb83 remove triton root 2025-05-15 14:32:49 +00:00
  • f81363d324 add speech continuation pretraining root 2025-05-15 14:16:51 +00:00
  • e6615df4eb fix stage 5 output pathing Kinan Martin 2025-05-15 09:11:40 +09:00
  • daff070d68 restore version of mls_english compute_fbank_mls_english.py and prepare.sh from commit 547f5c5 Kinan Martin 2025-05-15 07:24:26 +09:00
  • e34f2dbb2a merge change to remove bilingual param with new multidataset_datamodule Kinan Martin 2025-05-14 08:51:11 +09:00
  • eb5004880f deprecate params.bilingual=0, replace ReazonSpeechAsrDataModule for MultiDatasetAsrDataModule, not tested yet Kinan Martin 2025-05-14 08:40:15 +09:00
  • 7ef1811063 remove bilingual tag from train.py Bailey Hirota 2025-05-14 08:37:44 +09:00
  • e65725810c fix mmsu root 2025-05-13 09:13:12 +00:00
  • 24b6f42340 fix typos in docs Yifan Yang 2025-05-13 14:34:05 +08:00
  • 62dfe56cbe restore checkpoint save after validation yifanyeung 2025-05-13 06:14:59 +00:00
  • cbf3af31fd add voicebench eval root 2025-05-13 05:37:11 +00:00
  • b2df5bbb83 Revert "add fbank" Bailey Hirota 2025-05-02 23:18:53 +09:00
  • 82bd37cacd add fbank Bailey Hirota 2025-05-02 03:31:55 +09:00
  • 06667e1f6d add batch shave mechanism yfyeung 2025-05-12 16:49:42 +00:00
  • ea20ac208d Merge branch 'k2-fsa:master' into dev/speechllm Yifan Yang 2025-05-12 20:31:41 +08:00
  • 407828f22e Support multiple parallel augmentations. Fangjun Kuang 2025-05-12 19:35:37 +08:00
  • e79833aad2 ensure SwooshL/SwooshR output dtype matches input dtype (#1940) Yifan Yang 2025-05-12 19:28:48 +08:00
  • 89781b9bb1 add cosyvoice2 decode root 2025-05-12 10:06:59 +00:00
  • c709ce433d Merge branch 'k2-fsa:master' into dev/speechllm Yifan Yang 2025-05-12 14:38:13 +08:00
  • 2793ccdf56 remove checkpoint save after validation yfyeung 2025-05-12 06:36:20 +00:00
  • 9a0b5d706b Merge branch 'k2-fsa:master' into fix/swoosh Yifan Yang 2025-05-12 14:23:19 +08:00
  • 4627969ccd fix bug: undefined name 'partial' (#1941) Yifan Yang 2025-05-12 14:19:53 +08:00
  • d3f9603e58 fix bug: undefined name 'partial' Yifan Yang 2025-05-12 13:16:49 +08:00
  • e99e621f65 ensure SwooshL/SwooshR output dtype matches input dtype Yifan Yang 2025-05-12 12:58:57 +08:00
  • c078772e59 skip OOM yfyeung 2025-05-11 17:23:19 +00:00
  • 9939c2b72d remove duplicated torch autocast yfyeung 2025-05-11 17:03:44 +00:00
  • 5fbeed9f96 fix SwooshR and SwooshL Yifan Yang 2025-05-12 00:48:42 +08:00
  • cd3adad46d use quadratic-duration yfyeung 2025-05-10 17:47:05 +00:00
  • c75767f600 set world_size and rank explicitly yfyeung 2025-05-10 17:32:14 +00:00
  • fd31ed5b0b minor fix ssl_datamodule.py Yifan Yang 2025-05-11 00:59:06 +08:00
  • 260d37b65a Merge branch 'k2-fsa:master' into dev/k2ssl Yifan Yang 2025-05-11 00:33:35 +08:00
  • 2420d0c95f update multi_dataset.py Yifan Yang 2025-05-10 02:13:25 +08:00
  • ec6c8f748d fix data prepare yfyeung 2025-05-09 17:18:22 +00:00
  • b20a0d0e35 add on the fly feature root 2025-05-08 19:21:41 -07:00
  • 21d1bf73bb new version of multi_ja_en prepare.sh script which swaps Librispeech for MLS English Kinan Martin 2025-05-09 10:57:41 +09:00
  • 489c42b45e support zipformer encoder Yifan Yang 2025-05-08 04:31:34 +00:00
  • bd2df570ad add debug script root 2025-05-08 03:37:26 -07:00
  • 37db65984c remove k2 dependency root 2025-05-08 03:02:34 -07:00
  • e41c1cabd5 add dependency root 2025-05-08 07:56:14 +00:00
  • 7cc366d82d add en data, cosy2 token for training root 2025-05-08 07:23:22 +00:00
  • 2dd40b62ef add vocalnet en data root 2025-05-08 06:29:46 +00:00
  • 211c01bc1d format train.py Yifan Yang 2025-05-07 12:37:19 +00:00
  • 23b5a7ce3e format multi_dataset.py Yifan Yang 2025-05-07 12:29:12 +00:00
  • 9c8c4314de init zipformer_llm_zh Yifan Yang 2025-05-07 12:18:41 +00:00
  • 547f5c5cfb optimize with num_jobs on save_audios Kinan Martin 2025-05-02 07:22:38 +09:00
  • 88249f0eb4 fix stage 2 and 3 Kinan Martin 2025-05-01 08:15:07 +09:00
  • 90326c1f43 fix validation manifest name Kinan Martin 2025-05-01 08:05:42 +09:00
  • dc07bba236 init Yifan Yang 2025-04-30 09:54:42 +00:00
  • cd7caf12df Fix speech_llm recipe (#1936) Yifan Yang 2025-04-30 11:41:00 +08:00
  • b21f3aace5 Merge branch 'fix/speechllm' of github.com:yfyeung/icefall into fix/speechllm Your Name 2025-04-29 20:34:22 -07:00
  • 1210025d8b fix flake8 Your Name 2025-04-29 20:34:12 -07:00
  • dbe270ba94 adjusted prepare.sh to only calculate fbank and manifest together; adjust datamodule to load from manifest files Kinan Martin 2025-04-30 10:06:13 +09:00
  • bef49cca86 Merge branch 'k2-fsa:master' into fix/speechllm Yifan Yang 2025-04-30 01:44:58 +08:00
  • 973fcbc585 add missing backslash yfyeung 2025-04-29 10:39:29 -07:00
  • 26c022665b remove space if existing yfyeung 2025-04-29 10:32:12 -07:00
  • d1c336f589 remove ineffective normalize_text_alimeeting yfyeung 2025-04-29 10:28:01 -07:00
  • 26aef4a926 remove ineffective normalize_text_alimeeting yfyeung 2025-04-29 06:24:43 -07:00
  • 46b9be31cc remove batch shaving yfyeung 2025-04-28 04:37:57 -07:00
  • f5d2aa1f5d fix train/eval mode yfyeung 2025-04-28 09:10:12 +00:00
  • 59c577f4ef Fix convert_texts_into_ids() in the tedlium3 recipe. (#1929) Fangjun Kuang 2025-04-24 17:04:46 +08:00
  • 08be51a91f change pic root 2025-04-29 10:09:57 +00:00
  • 11bd3c9ad8 lint root 2025-04-29 09:46:44 +00:00
  • 360f0aa397 update README root 2025-04-29 08:49:12 +00:00
  • 448a4eeea7 update hf dataset loading into lhotse root 2025-04-29 07:33:34 +00:00
  • d742043e75 refactor decode part Yuekai Zhang 2025-04-25 18:31:43 +08:00
  • 71a0a442a6 add history cache root 2025-04-25 10:05:07 +00:00
  • 47920c2336 add gradio demo Yuekai Zhang 2025-04-25 16:05:37 +08:00
  • 72addd40f5 change place Yuekai Zhang 2025-04-25 14:22:16 +08:00