Fangjun Kuang
34fc1fdf0d
Fix transformer decoder layer ( #1995 )
2025-07-18 20:12:29 +08:00
Bailey Machiko Hirota
5fe13078cc
Musan implementation for ReazonSpeech ( #1988 )
2025-07-18 17:16:19 +08:00
Yifan Yang
9fd0f2dc1d
support left pad for make_pad_mask ( #1990 )
2025-07-16 23:59:04 +08:00
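The commit above adds left-padding support to `make_pad_mask`. A minimal sketch of the semantics, assuming a `pad_left` flag (the parameter name and exact signature are assumptions; see #1990 for the actual implementation):

```python
import torch

def make_pad_mask(lengths: torch.Tensor, max_len: int = 0,
                  pad_left: bool = False) -> torch.Tensor:
    """Return a bool mask of shape (batch, max_len) where True marks padding.

    With right padding, positions >= length are padding; with left padding
    (pad_left=True), the padding occupies the leading positions instead.
    """
    max_len = max(max_len, int(lengths.max()))
    n = lengths.size(0)
    seq = torch.arange(max_len, device=lengths.device).expand(n, max_len)
    if pad_left:
        # position i is padding if i < max_len - length
        return seq < (max_len - lengths).unsqueeze(1)
    # right padding: position i is padding if i >= length
    return seq >= lengths.unsqueeze(1)
```

For `lengths = [2, 3]`, right padding masks the trailing position of the shorter sequence, while left padding masks its leading position.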
Fangjun Kuang
e22bc78f98
Export streaming zipformer2 to RKNN ( #1977 )
2025-07-11 13:24:01 +08:00
Teo Wen Shen
da87e7fc99
add weights_only=False to torch.load ( #1984 )
2025-07-10 15:27:08 +08:00
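For context on the commit above: PyTorch 2.6 flipped the default of `torch.load` to `weights_only=True`, which rejects checkpoints containing arbitrary Python objects (epoch counters, optimizer state, etc.). Passing `weights_only=False` restores the old behaviour; it re-enables pickle execution, so it should only be used on trusted checkpoints. A small self-contained illustration:

```python
import os
import tempfile

import torch

# A checkpoint holding more than bare tensors (a plain int here).
checkpoint = {"model": {"w": torch.zeros(2)}, "epoch": 3}
path = os.path.join(tempfile.mkdtemp(), "ckpt.pt")
torch.save(checkpoint, path)

# weights_only=False allows loading the full pickled object graph.
loaded = torch.load(path, map_location="cpu", weights_only=False)
```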
Yifan Yang
89728dd4f8
Refactor data preparation for GigaSpeech recipe ( #1986 )
2025-07-10 11:17:37 +08:00
Mistmoon
9293edc62f
Add cr-ctc loss and ctc-decode in aishell ( #1980 )
2025-07-08 14:47:24 +08:00
Fangjun Kuang
fba5e67d5e
Fix CI tests. ( #1974 )
...
- Introduce unified AMP helpers (create_grad_scaler, torch_autocast) to handle
deprecations in PyTorch ≥2.3.0
- Replace direct uses of torch.cuda.amp.GradScaler and torch.cuda.amp.autocast
with the new utilities across all training and inference scripts
- Update all torch.load calls to include weights_only=False for compatibility with
newer PyTorch versions
2025-07-01 13:47:55 +08:00
Fangjun Kuang
71377d21cd
Export streaming zipformer models with whisper feature to onnx ( #1973 )
2025-06-30 19:01:15 +08:00
Fangjun Kuang
abd9437e6d
Add more wheels for piper-phonemize ( #1969 )
2025-06-24 14:49:16 +08:00
Wei Kang
e1cf4dbace
rm zipvoice ( #1967 )
2025-06-23 19:22:35 +08:00
Wei Kang
343b8fa2dc
Using non strict match in context graph for contextual words ( #1952 )
2025-06-19 12:27:15 +08:00
Wei Kang
f80a2ee110
Decrease num_buckets & remove shuffle_buffer_size ( #1955 )
2025-06-19 12:26:37 +08:00
Wei Kang
3587c4b3b7
Fix decoding byte bpes tokens to words. ( #1966 )
2025-06-19 12:26:01 +08:00
Wei Kang
762f965cf7
[zipvoice] Add requirements.txt and pinyin.txt, remove k2 from pretrained model inference. ( #1965 )
...
* Add requirements.txt and pinyin.txt needed by zipvoice
* simplify the requirements for pretrained model inference
2025-06-18 18:38:46 +08:00
Wei Kang
06539d2b9d
Add Zipvoice ( #1964 )
...
* Add ZipVoice - a flow-matching based zero-shot TTS model.
2025-06-17 20:17:12 +08:00
Zengwei Yao
ffb7d05635
refactor branch exchange in cr-ctc ( #1954 )
2025-05-27 12:09:59 +08:00
Mahsa Yarmohammadi
021e1a8846
Add acknowledgment to README ( #1950 )
2025-05-22 22:06:35 +08:00
Tianxiang Zhao
30e7ea4b5a
Fix a bug in finetune.py --use-mux ( #1949 )
2025-05-22 12:05:01 +08:00
Fangjun Kuang
fd8f8780fa
Fix logging torch.dtype. ( #1947 )
2025-05-21 12:04:57 +08:00
Yifan Yang
e79833aad2
ensure SwooshL/SwooshR output dtype matches input dtype ( #1940 )
2025-05-12 19:28:48 +08:00
Yifan Yang
4627969ccd
fix bug: undefined name 'partial' ( #1941 )
2025-05-12 14:19:53 +08:00
Yifan Yang
cd7caf12df
Fix speech_llm recipe ( #1936 )
...
* fix training/decoding scripts, cleanup unused code, and ensure compliance with style checks
---------
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2025-04-30 11:41:00 +08:00
Fangjun Kuang
cc2e64a6aa
Fix convert_texts_into_ids() in the tedlium3 recipe. ( #1929 )
2025-04-24 17:04:46 +08:00
Yifan Yang
5ec95e5482
Fix SpeechLLM recipe ( #1926 )
2025-04-23 16:18:38 +08:00
math345
64c5364085
Fix bug: When resuming training from a checkpoint, model_avg was not assigned, resulting in a None error. ( #1914 )
2025-04-10 11:37:28 +08:00
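A hedged sketch of the fix described above: in icefall-style training, `model_avg` holds a float64 running average of the model weights, and when resuming from a checkpoint it must be re-initialized rather than left as `None`. The helper name and checkpoint layout below are assumptions for illustration:

```python
import copy

import torch

def init_model_avg(model: torch.nn.Module, checkpoint: dict) -> torch.nn.Module:
    """Return a usable model_avg when resuming training.

    If the checkpoint carries an averaged model, reuse it; otherwise fall
    back to a fresh float64 copy of the current model so later averaging
    steps never see None.
    """
    if checkpoint.get("model_avg") is not None:
        return checkpoint["model_avg"]
    return copy.deepcopy(model).to(torch.float64)
```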
Fangjun Kuang
300a821f58
Fix aishell training ( #1916 )
2025-04-10 10:30:37 +08:00
Fangjun Kuang
171cf8c9fe
Avoid redundant computation in PiecewiseLinear. ( #1915 )
2025-04-09 11:52:37 +08:00
Wei Kang
86bd16d496
[KWS]Remove graph compiler ( #1905 )
2025-04-02 22:10:06 +08:00
Fangjun Kuang
db9fb8ad31
Add scripts to export streaming zipformer(v1) to RKNN ( #1882 )
2025-02-27 17:10:58 +08:00
Yuekai Zhang
2ba665abca
Add F5-TTS with semantic token training results ( #1880 )
...
* add cosy token
* update inference code
* add extract cosy token
* update results
* add requirements.txt
* update readme
---------
Co-authored-by: yuekaiz <yuekaiz@h20-7.cm.cluster>
Co-authored-by: yuekaiz <yuekaiz@mgmt1-login.cm.cluster>
2025-02-24 13:58:47 +08:00
Machiko Bailey
da597ad782
Update RESULTS.md ( #1873 )
2025-02-04 09:04:25 +08:00
Machiko Bailey
0855b0338a
Merge japanese-to-english multilingual branch ( #1860 )
...
* add streaming support to reazonresearch
* update README for streaming
* Update RESULTS.md
* add onnx decode
---------
Co-authored-by: root <root@KDA03.cm.cluster>
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
Co-authored-by: root <root@KDA01.cm.cluster>
Co-authored-by: zr_jin <peter.jin.cn@gmail.com>
2025-02-04 01:33:09 +08:00
Yuekai Zhang
dd5d7e358b
F5-TTS Training Recipe for WenetSpeech4TTS ( #1846 )
...
* add f5
* add infer
* add dit
* add README
* update pretrained checkpoint usage
---------
Co-authored-by: yuekaiz <yuekaiz@h20-5.cm.cluster>
Co-authored-by: yuekaiz <yuekaiz@l20-3.cm.cluster>
Co-authored-by: yuekaiz <yuekaiz@h20-6.cm.cluster>
Co-authored-by: zr_jin <peter.jin.cn@gmail.com>
2025-01-27 16:33:02 +08:00
zr_jin
39c466e802
Update shared ( #1868 )
2025-01-21 11:04:11 +08:00
zr_jin
79074ef0d4
removed the erroneous "continual" implementation ( #1865 )
2025-01-16 20:51:28 +08:00
zr_jin
8ab0352e60
Update style_check.yml ( #1866 )
2025-01-16 17:36:09 +08:00
Han Zhu
ab91112909
Improve infinity-check ( #1862 )
...
1. Attach the inf-check hooks if the grad scale is getting too small.
2. Add try-catch to avoid OOM in the inf-check hooks.
3. Set warmup_start=0.1 to reduce chances of divergence
2025-01-09 15:05:38 +08:00
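Points 1 and 2 of the commit body above can be sketched as forward hooks that warn on non-finite outputs, attached only once the grad scale drops below a threshold, with the check wrapped in try/except so an OOM inside the check cannot kill training. Names and the threshold value are assumptions, not the actual implementation:

```python
import logging

import torch

GRAD_SCALE_THRESHOLD = 0.01  # assumed value for "too small"

def attach_inf_check_hooks(model: torch.nn.Module) -> None:
    def hook(module, inputs, output):
        try:
            if isinstance(output, torch.Tensor) and not torch.isfinite(output).all():
                logging.warning("non-finite output in %s", type(module).__name__)
        except RuntimeError:
            # e.g. CUDA OOM raised while evaluating the check itself.
            pass
    for m in model.modules():
        m.register_forward_hook(hook)

def maybe_attach(model: torch.nn.Module, grad_scale: float) -> bool:
    """Attach the hooks only when the grad scale looks suspicious."""
    if grad_scale < GRAD_SCALE_THRESHOLD:
        attach_inf_check_hooks(model)
        return True
    return False
```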
Seonuk Kim
8d602806c3
Update conformer.py ( #1859 )
...
* Update conformer.py
feedforward dimention -> feedforward dimension
* Update conformer.py
Swich -> Swish
2025-01-06 17:31:13 +08:00
Seonuk Kim
3b6d54007b
Update conformer.py ( #1857 )
...
* Update conformer.py
feedforward dimention -> feedforward dimension
2025-01-06 13:17:02 +08:00
Fangjun Kuang
3b263539cd
Publish MatchaTTS onnx models trained with LJSpeech to huggingface ( #1854 )
2025-01-02 15:54:34 +08:00
Fangjun Kuang
bfffda5afb
Add MatchaTTS for the Chinese dataset Baker ( #1849 )
2024-12-31 17:17:05 +08:00
Han Zhu
df46a3eaf9
Warn instead of raising exceptions in inf-check ( #1852 )
2024-12-31 16:52:06 +08:00
Yifan Yang
a2b0f6057c
Small fix ( #1853 )
2024-12-31 07:41:44 +08:00
Han Zhu
48088cb807
Refactor optimizer ( #1837 )
...
* Print indexes of largest grad
2024-12-30 15:30:02 +08:00
Han Zhu
57e9f2a8db
Add the "rms-sort" diagnostics ( #1851 )
2024-12-30 15:27:05 +08:00
Fangjun Kuang
ad966fb81d
Minor fixes to the onnx inference script for ljspeech matcha-tts. ( #1838 )
2024-12-19 15:19:41 +08:00
Fangjun Kuang
92ed1708c0
Add torch 1.13 and 2.0 to CI tests ( #1840 )
2024-12-18 16:50:14 +08:00
Fangjun Kuang
d4d4f281ec
Revert "Replace deprecated pytorch methods ( #1814 )" ( #1841 )
...
This reverts commit 3e4da5f78160d3dba3bdf97968bd7ceb8c11631f.
2024-12-18 16:49:57 +08:00
Li Peng
3e4da5f781
Replace deprecated pytorch methods ( #1814 )
...
* Replace deprecated pytorch methods
- torch.cuda.amp.GradScaler(...) => torch.amp.GradScaler("cuda", ...)
- torch.cuda.amp.autocast(...) => torch.amp.autocast("cuda", ...)
* Replace `with autocast(...)` with `with autocast("cuda", ...)`
Co-authored-by: Li Peng <lipeng@unisound.ai>
2024-12-16 10:24:16 +08:00