Commit Graph

  • fba5e67d5e Fix CI tests. (#1974) Fangjun Kuang 2025-07-01 13:47:55 +08:00
  • 127b4985d3 small fixes k2-fsa 2025-07-01 13:41:56 +08:00
  • b4e9edbed1 minor fixes k2-fsa 2025-07-01 13:31:03 +08:00
  • 82af46284f Merge branch 'fix-ci-2' into fix-ci k2-fsa 2025-07-01 11:41:26 +08:00
  • 633eec5445 small fixes k2-fsa 2025-07-01 11:14:48 +08:00
  • a91d890552 fix grad scaler k2-fsa 2025-07-01 00:05:08 +08:00
  • a1277c9ae9 fix grad scaler k2-fsa 2025-07-01 00:05:08 +08:00
  • f186e1d427 Fix weights_only=False k2-fsa 2025-06-30 22:07:36 +08:00
  • a53c323750 Fix CI warnings k2-fsa 2025-06-30 21:46:18 +08:00
  • ffe2f16b1d Fix librispeech CI test errors k2-fsa 2025-06-30 20:36:21 +08:00
  • fe36fcc25c Refactor CI k2-fsa 2025-06-30 19:04:02 +08:00
  • 71377d21cd Export streaming zipformer models with whisper feature to onnx (#1973) Fangjun Kuang 2025-06-30 19:01:15 +08:00
  • a318ac20c3 export fp16 onnx models Fangjun Kuang 2025-06-30 11:34:50 +08:00
  • 9d4b0dfcd4 Export multi_zh-hans models to onnx Fangjun Kuang 2025-06-30 10:58:18 +08:00
  • abd9437e6d Add more wheels for piper-phonemize (#1969) Fangjun Kuang 2025-06-24 14:49:16 +08:00
  • a879de95a3 deploy: 93940904d8aa837ef9d4def90f481462976dcbd1 csukuangfj 2025-06-24 06:33:27 +00:00
  • 93940904d8 fix windows Fangjun Kuang 2025-06-24 14:32:12 +08:00
  • 896009611f deploy: 5fe1fad4ec53967f4ca339d9b90d7bb18280f93d csukuangfj 2025-06-24 03:18:26 +00:00
  • 5fe1fad4ec update ci Fangjun Kuang 2025-06-24 11:17:43 +08:00
  • bbbf798375 update piper-phonemize wheels Fangjun Kuang 2025-06-24 11:15:38 +08:00
  • e1cf4dbace rm zipvoice (#1967) Wei Kang 2025-06-23 19:22:35 +08:00
  • 0c9bd934c2 rm zipvoice pkufool 2025-06-23 19:16:33 +08:00
  • 343b8fa2dc Using non strict match in context graph for contextual words (#1952) Wei Kang 2025-06-19 12:27:15 +08:00
  • f80a2ee110 Decrease num_buckets & remove shuffle_buffer_size (#1955) Wei Kang 2025-06-19 12:26:37 +08:00
  • 3587c4b3b7 Fix decoding byte bpes tokens to words. (#1966) Wei Kang 2025-06-19 12:26:01 +08:00
  • 2e1a1af049 ignore decode errors. Wei Kang 2025-06-19 11:30:21 +08:00
  • ba5ffc711e Minor fix. Wei Kang 2025-06-19 11:25:48 +08:00
  • 857507795d Fix deocding byte bpes tokens to words. Wei Kang 2025-06-19 11:17:38 +08:00
  • 56349001d6 Merge branch 'k2-fsa:master' into dev/speechllm Yifan Yang 2025-06-18 21:09:44 +08:00
  • 762f965cf7 [zipvoice] Add requirements.txt and pinyin.txt, remove k2 from pretrained model inference. (#1965) Wei Kang 2025-06-18 18:38:46 +08:00
  • 53111d0e46 fix for multigpu yfyeung 2025-06-18 07:33:15 +00:00
  • dae64dd08d simplify the requirements for pretrained model inference pkufool 2025-06-18 13:51:40 +08:00
  • c197be2c05 simplify the requirements for pretrained model inference pkufool 2025-06-18 13:50:39 +08:00
  • 39d90356fe fix deepspeed config yfyeung 2025-06-18 04:44:10 +00:00
  • c571a88b59 Merge branch 'k2-fsa:master' into dev/speechllm Yifan Yang 2025-06-18 12:29:27 +08:00
  • 34639d5249 use padding instead of trimming (suggested by @shylockasr) Yifan Yang 2025-06-03 21:45:47 +08:00
  • 05e3094429 refactor branch exchange in cr-ctc (#1954) Zengwei Yao 2025-05-27 12:09:59 +08:00
  • d23bacc23b fix isort pkufool 2025-06-18 12:07:46 +08:00
  • 88c35c5e29 fix flake8 pkufool 2025-06-18 12:00:05 +08:00
  • df382566dc Add requirements.txt and pinyin.txt needed by zipvoice pkufool 2025-06-18 11:49:56 +08:00
  • 06539d2b9d Add Zipvoice (#1964) Wei Kang 2025-06-17 20:17:12 +08:00
  • e45da09009 Minor fixes pkufool 2025-06-17 20:02:05 +08:00
  • dc731ea089 minor fixes pkufool 2025-06-17 19:48:38 +08:00
  • 2376ed2117 add emilia data preparation pipeline pkufool 2025-06-17 19:38:46 +08:00
  • 60572c2444 Minor fixes to infer pretrained model pkufool 2025-06-17 16:02:20 +08:00
  • 8c529ebe90 Merge pull request #3 from zhu-han/zipvoice Wei Kang 2025-06-17 10:29:42 +08:00
  • ecfc36ba9e Update the paper link Han Zhu 2025-06-17 10:03:25 +08:00
  • 9936d726d2 Add ZipVoice Han Zhu 2025-06-16 09:45:34 +08:00
  • 252e5eb2e1 remove unused local scripts Bailey Hirota 2025-06-13 00:49:40 +09:00
  • fe9f975ec2 changes to train script - no need for limiting utterance length here Bailey Hirota 2025-06-13 00:48:37 +09:00
  • e1f140a50e remove commented out codels Bailey Hirota 2025-06-13 00:33:47 +09:00
  • 78d4e50d0f add stage 6 - update cutset paths to prepare Bailey Hirota 2025-06-12 00:21:52 +09:00
  • da75835639 update manifest dir path Bailey Hirota 2025-06-12 00:20:41 +09:00
  • 5a120cbcb3 add step 4: display manifest stats to mls_eng Bailey Hirota 2025-06-11 18:06:08 +09:00
  • 003e94fac2 Update README.md to reflect MLS English dataset Kinan Martin 2025-06-11 09:19:07 +09:00
  • c7c74b8658 Add failsafe for MLS English dev set key alternate name as validation Kinan Martin 2025-06-11 09:18:28 +09:00
  • c8d932b0c2 Parametrize dev and test split sizes. Kinan Martin 2025-06-10 10:11:33 +09:00
  • a6f60de9dd add utility file for creating subsets of mls english. must be fixed to make dev and test splits have matching sizes to reazonspeech Kinan Martin 2025-06-06 11:44:27 +09:00
  • 052fcc3218 add utility file for updating the storage_path of cutsets for use in the multilingual training recipe directory structure Kinan Martin 2025-06-06 11:42:08 +09:00
  • 6255ba5cb2 fix decode script data module usage Kinan Martin 2025-06-06 11:29:29 +09:00
  • 559f9e2def fix repeat bos and pad id root 2025-06-04 10:02:42 +00:00
  • ce894a7ba2 Combined updates. Changed BBPE path structure, changed dataset path structure, added script to update cutset paths. WIP Kinan Martin 2025-06-04 10:12:39 +09:00
  • 80677a55f8 remove stats root 2025-06-03 00:48:39 -07:00
  • 5becf6927d remove concat three items root 2025-06-03 00:18:21 -07:00
  • 4c0396f8f2 support text2speech ultrachat root 2025-06-02 23:16:03 -07:00
  • 0f88a3a6c3 First working example Fangjun Kuang 2025-05-30 15:42:31 +08:00
  • 516696f3e4 Merge remote-tracking branch 'dan/master' into dataset-parallel-augmentation-example Fangjun Kuang 2025-05-29 17:04:50 +08:00
  • 3b52e0cb9e minor fixes Fangjun Kuang 2025-05-29 12:11:56 +08:00
  • dc74705d20 remove cr-loss Fangjun Kuang 2025-05-29 11:49:30 +08:00
  • 9b95c72d19 copy files Fangjun Kuang 2025-05-29 11:45:17 +08:00
  • 5ec7297f32 add dataset example for librispeech Fangjun Kuang 2025-05-29 11:44:40 +08:00
  • 717aa53be9 Copy files Fangjun Kuang 2025-05-29 11:06:55 +08:00
  • 7c30dd570b restrict deepspeed >=0.16.9 yfyeung 2025-05-28 03:42:03 +00:00
  • 48026bd62b Decrease num_buckets & remove shuffle_buffer_size pkufool 2025-05-28 10:46:37 +08:00
  • 49256fa917 fix tts stage decode root 2025-05-28 02:34:07 +00:00
  • 5a7c72cb47 add tts task decode root 2025-05-27 02:12:22 -07:00
  • 1281d7a515 add tts training root 2025-05-27 00:18:23 -07:00
  • ffb7d05635 refactor branch exchange in cr-ctc (#1954) Zengwei Yao 2025-05-27 12:09:59 +08:00
  • 4fb970b2ca refactor branch exchange in cr-ctc yaozengwei 2025-05-27 12:04:38 +08:00
  • 39700d5c94 refactor train to reuse code root 2025-05-26 19:53:16 -07:00
  • 11ccaa3ab8 add requirements.txt yfyeung 2025-05-26 04:11:28 +00:00
  • d1a535dc76 Merge branch 'k2-fsa:master' into dev/speechllm Yifan Yang 2025-05-24 13:13:42 +08:00
  • a6aaf33843 Using non strict match in context graph for contextual words pkufool 2025-05-23 17:43:30 +08:00
  • e6e1f3fa4f add tts stage root 2025-05-23 01:53:05 -07:00
  • dd858f0cd1 support instruct s2s root 2025-05-22 23:16:33 -07:00
  • 9fff18edec refactor code root 2025-05-22 19:14:52 -07:00
  • 021e1a8846 Add acknowledgment to README (#1950) Mahsa Yarmohammadi 2025-05-22 10:06:35 -04:00
  • 452a993ab2 Add acknowledgment to README Mahsa Yarmohammadi 2025-05-22 08:51:03 -04:00
  • 7a12d88d6c update root 2025-05-21 22:18:57 -07:00
  • 7aa6c80ddb add multi gpu processing root 2025-05-21 21:54:59 -07:00
  • 30e7ea4b5a Fix a bug in finetune.py --use-mux (#1949) Tianxiang Zhao 2025-05-22 12:05:01 +08:00
  • b05b2604e9 Fix a bug in finetune.py --use-mux Redemption 2025-05-22 11:58:58 +08:00
  • 1f11ba4d28 use huggingface_hub library to download mls_english Kinan Martin 2025-05-22 09:15:12 +09:00
  • fd8f8780fa Fix logging torch.dtype. (#1947) Fangjun Kuang 2025-05-21 12:04:57 +08:00
  • 2ca458bb56 Fix logging torch.dtype. Fangjun Kuang 2025-05-21 12:01:38 +08:00
  • f3f04fa626 switch mls_english clone from https to ssh Kinan Martin 2025-05-21 10:25:47 +09:00
  • ca84aff5d6 remove cosyvoice lib root 2025-05-20 00:52:09 -07:00
  • 9cdd393f43 add server url root 2025-05-20 07:48:49 +00:00
  • 50fc1aba60 add multi-node root 2025-05-18 18:47:22 -07:00
  • 4a29430349 add loss type root 2025-05-19 01:31:21 +00:00