997 Commits

Author SHA1 Message Date
Yifan Yang
c571a88b59
Merge branch 'k2-fsa:master' into dev/speechllm 2025-06-18 12:29:27 +08:00
Yifan Yang
34639d5249 use padding instead of trimming (suggested by @shylockasr)
use ctc compress (suggested by @shylockasr)

fix

revert

revert

revert
2025-06-18 04:25:30 +00:00
Zengwei Yao
05e3094429 refactor branch exchange in cr-ctc (#1954) 2025-06-18 04:25:15 +00:00
Wei Kang
06539d2b9d
Add Zipvoice (#1964)
* Add ZipVoice - a flow-matching based zero-shot TTS model.
2025-06-17 20:17:12 +08:00
yfyeung
7c30dd570b restrict deepspeed >=0.16.9 2025-05-28 03:42:03 +00:00
Zengwei Yao
ffb7d05635
refactor branch exchange in cr-ctc (#1954) 2025-05-27 12:09:59 +08:00
yfyeung
11ccaa3ab8 add requirements.txt 2025-05-26 04:11:28 +00:00
Yifan Yang
d1a535dc76
Merge branch 'k2-fsa:master' into dev/speechllm 2025-05-24 13:13:42 +08:00
Tianxiang Zhao
30e7ea4b5a
Fix a bug in finetune.py --use-mux (#1949) 2025-05-22 12:05:01 +08:00
Yifan Yang
24b6f42340 fix typos in docs
fix typo in RESULTS.md

Update RESULTS.md
2025-05-13 14:51:17 +08:00
yifanyeung
62dfe56cbe restore checkpoint save after validation 2025-05-13 06:14:59 +00:00
yfyeung
06667e1f6d add batch shave mechanism
fix

fix
2025-05-12 17:39:15 +00:00
Yifan Yang
e79833aad2
ensure SwooshL/SwooshR output dtype matches input dtype (#1940) 2025-05-12 19:28:48 +08:00
Yifan Yang
c709ce433d
Merge branch 'k2-fsa:master' into dev/speechllm 2025-05-12 14:38:13 +08:00
yfyeung
2793ccdf56 remove checkpoint save after validation 2025-05-12 06:36:20 +00:00
Yifan Yang
4627969ccd
fix bug: undefined name 'partial' (#1941) 2025-05-12 14:19:53 +08:00
yfyeung
c078772e59 skip OOM 2025-05-11 17:23:19 +00:00
yfyeung
9939c2b72d remove duplicated torch autocast 2025-05-11 17:03:44 +00:00
Yifan Yang
5fbeed9f96
fix SwooshR and SwooshL 2025-05-12 00:48:42 +08:00
yfyeung
cd3adad46d use quadratic-duration 2025-05-10 17:47:30 +00:00
yfyeung
c75767f600 set world_size and rank explicitly
update
2025-05-10 17:47:28 +00:00
Yifan Yang
2420d0c95f
update multi_dataset.py 2025-05-10 02:13:25 +08:00
yfyeung
ec6c8f748d fix data prepare
update
2025-05-09 17:20:38 +00:00
Yifan Yang
489c42b45e support zipformer encoder
update

update

update

update

fix

reformat

support infer

update
2025-05-08 14:44:09 +00:00
Yifan Yang
211c01bc1d format train.py
minor fix train.py
2025-05-08 04:30:02 +00:00
Yifan Yang
23b5a7ce3e format multi_dataset.py 2025-05-08 04:28:57 +00:00
Yifan Yang
9c8c4314de init zipformer_llm_zh 2025-05-07 12:18:41 +00:00
Yifan Yang
dc07bba236 init
fix
2025-04-30 09:58:33 +00:00
Yifan Yang
cd7caf12df
Fix speech_llm recipe (#1936)
* fix training/decoding scripts, cleanup unused code, and ensure compliance with style checks

---------

Co-authored-by: Your Name <you@example.com>
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2025-04-30 11:41:00 +08:00
Fangjun Kuang
cc2e64a6aa
Fix convert_texts_into_ids() in the tedlium3 recipe. (#1929) 2025-04-24 17:04:46 +08:00
Yifan Yang
5ec95e5482
Fix SpeechLLM recipe (#1926) 2025-04-23 16:18:38 +08:00
math345
64c5364085
Fix bug: When resuming training from a checkpoint, model_avg was not assigned, resulting in a None error. (#1914) 2025-04-10 11:37:28 +08:00
Fangjun Kuang
300a821f58
Fix aishell training (#1916) 2025-04-10 10:30:37 +08:00
Fangjun Kuang
171cf8c9fe
Avoid redundant computation in PiecewiseLinear. (#1915) 2025-04-09 11:52:37 +08:00
Wei Kang
86bd16d496
[KWS]Remove graph compiler (#1905) 2025-04-02 22:10:06 +08:00
Fangjun Kuang
db9fb8ad31
Add scripts to export streaming zipformer(v1) to RKNN (#1882) 2025-02-27 17:10:58 +08:00
Yuekai Zhang
2ba665abca
Add F5-TTS with semantic token training results (#1880)
* add cosy token

* update inference code

* add extract cosy token

* update results

* add requirements.txt

* update readme

---------

Co-authored-by: yuekaiz <yuekaiz@h20-7.cm.cluster>
Co-authored-by: yuekaiz <yuekaiz@mgmt1-login.cm.cluster>
2025-02-24 13:58:47 +08:00
Machiko Bailey
da597ad782
Update RESULTS.md (#1873) 2025-02-04 09:04:25 +08:00
Machiko Bailey
0855b0338a
Merge japanese-to-english multilingual branch (#1860)
* add streaming support to reazonresearch

* update README for streaming

* Update RESULTS.md

* add onnx decode

---------

Co-authored-by: root <root@KDA03.cm.cluster>
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
Co-authored-by: root <root@KDA01.cm.cluster>
Co-authored-by: zr_jin <peter.jin.cn@gmail.com>
2025-02-04 01:33:09 +08:00
Yuekai Zhang
dd5d7e358b
F5-TTS Training Recipe for WenetSpeech4TTS (#1846)
* add f5

* add infer

* add dit

* add README

* update pretrained checkpoint usage

---------

Co-authored-by: yuekaiz <yuekaiz@h20-5.cm.cluster>
Co-authored-by: yuekaiz <yuekaiz@l20-3.cm.cluster>
Co-authored-by: yuekaiz <yuekaiz@h20-6.cm.cluster>
Co-authored-by: zr_jin <peter.jin.cn@gmail.com>
2025-01-27 16:33:02 +08:00
zr_jin
39c466e802
Update shared (#1868) 2025-01-21 11:04:11 +08:00
zr_jin
79074ef0d4
removed the erroneous "continual" implementation (#1865) 2025-01-16 20:51:28 +08:00
Han Zhu
ab91112909
Improve infinity-check (#1862)
1. Attach the inf-check hooks if the grad scale is getting too small.
2. Add try-catch to avoid OOM in the inf-check hooks.
3. Set warmup_start=0.1 to reduce chances of divergence
2025-01-09 15:05:38 +08:00
Seonuk Kim
8d602806c3
Update conformer.py (#1859)
* Update conformer.py

feedforward dimention -> feedforward dimension

* Update conformer.py

Swich -> Swish
2025-01-06 17:31:13 +08:00
Seonuk Kim
3b6d54007b
Update conformer.py (#1857)
* Update conformer.py

feedforward dimention -> feedforward dimension
2025-01-06 13:17:02 +08:00
Fangjun Kuang
3b263539cd
Publish MatchaTTS onnx models trained with LJSpeech to huggingface (#1854) 2025-01-02 15:54:34 +08:00
Fangjun Kuang
bfffda5afb
Add MatchaTTS for the Chinese dataset Baker (#1849) 2024-12-31 17:17:05 +08:00
Yifan Yang
a2b0f6057c
Small fix (#1853) 2024-12-31 07:41:44 +08:00
Han Zhu
48088cb807
Refactor optimizer (#1837)
* Print indexes of largest grad
2024-12-30 15:30:02 +08:00
Fangjun Kuang
ad966fb81d
Minor fixes to the onnx inference script for ljspeech matcha-tts. (#1838) 2024-12-19 15:19:41 +08:00