Yifan Yang
70f13e54d8
Merge branch 'k2-fsa:master' into dev/speechllm
2025-07-07 11:32:12 +08:00
Fangjun Kuang
fba5e67d5e
Fix CI tests. ( #1974 )
- Introduce unified AMP helpers (create_grad_scaler, torch_autocast) to handle deprecations in PyTorch ≥2.3.0
- Replace direct uses of torch.cuda.amp.GradScaler and torch.cuda.amp.autocast with the new utilities across all training and inference scripts
- Update all torch.load calls to include weights_only=False for compatibility with newer PyTorch versions
2025-07-01 13:47:55 +08:00
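For readers unfamiliar with the AMP change described in fba5e67d5e ( #1974 ), the following is a minimal sketch of what version-gated helpers of this kind might look like. Only the names create_grad_scaler and torch_autocast come from the commit message; their bodies here, and the helpers parse_version, use_new_amp_api and load_checkpoint, are assumptions made for illustration and are not taken from the repository.

```python
import re


def parse_version(version: str) -> tuple:
    """Turn a string such as '2.3.0+cu121' into (2, 3, 0). Hypothetical helper."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def use_new_amp_api(torch_version: str) -> bool:
    """True when torch.amp.* should be preferred over torch.cuda.amp.*.

    The 2.3.0 threshold is the one named in the commit message.
    """
    return parse_version(torch_version) >= (2, 3, 0)


def create_grad_scaler(device: str = "cuda", **kwargs):
    """Assumed shape of the helper: pick the GradScaler class by version."""
    import torch  # imported lazily so the version helpers work without torch

    if use_new_amp_api(torch.__version__):
        return torch.amp.GradScaler(device, **kwargs)
    return torch.cuda.amp.GradScaler(**kwargs)


def torch_autocast(device_type: str = "cuda", **kwargs):
    """Assumed shape of the helper: pick the autocast context by version."""
    import torch

    if use_new_amp_api(torch.__version__):
        return torch.amp.autocast(device_type=device_type, **kwargs)
    return torch.cuda.amp.autocast(**kwargs)


def load_checkpoint(path, map_location="cpu"):
    """Hypothetical wrapper showing an explicit weights_only=False.

    weights_only=False unpickles arbitrary objects, so this is only
    appropriate for checkpoint files from a trusted source.
    """
    import torch

    return torch.load(path, map_location=map_location, weights_only=False)
```

Under these assumptions a training loop would call create_grad_scaler(enabled=use_fp16) once and wrap the forward pass in `with torch_autocast(enabled=use_fp16):`, where use_fp16 is a placeholder flag.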
Fangjun Kuang
71377d21cd
Export streaming zipformer models with whisper feature to onnx ( #1973 )
2025-06-30 19:01:15 +08:00
Fangjun Kuang
abd9437e6d
Add more wheels for piper-phonemize ( #1969 )
2025-06-24 14:49:16 +08:00
Wei Kang
e1cf4dbace
rm zipvoice ( #1967 )
2025-06-23 19:22:35 +08:00
Wei Kang
343b8fa2dc
Use non-strict matching in context graph for contextual words ( #1952 )
2025-06-19 12:27:15 +08:00
Wei Kang
f80a2ee110
Decrease num_buckets & remove shuffle_buffer_size ( #1955 )
2025-06-19 12:26:37 +08:00
Wei Kang
3587c4b3b7
Fix decoding of byte BPE tokens to words. ( #1966 )
2025-06-19 12:26:01 +08:00
Yifan Yang
56349001d6
Merge branch 'k2-fsa:master' into dev/speechllm
2025-06-18 21:09:44 +08:00
Wei Kang
762f965cf7
[zipvoice] Add requirements.txt and pinyin.txt, remove k2 from pretrained model inference. ( #1965 )
* Add requirements.txt and pinyin.txt needed by zipvoice
* simplify the requirements for pretrained model inference
2025-06-18 18:38:46 +08:00
yfyeung
53111d0e46
fix for multigpu
2025-06-18 07:33:15 +00:00
yfyeung
39d90356fe
fix deepspeed config
fix
2025-06-18 05:04:00 +00:00
Yifan Yang
c571a88b59
Merge branch 'k2-fsa:master' into dev/speechllm
2025-06-18 12:29:27 +08:00
Yifan Yang
34639d5249
use padding instead of trimming (suggested by @shylockasr)
use ctc compress (suggested by @shylockasr)
fix
revert
revert
revert
2025-06-18 04:25:30 +00:00
Zengwei Yao
05e3094429
refactor branch exchange in cr-ctc ( #1954 )
2025-06-18 04:25:15 +00:00
Wei Kang
06539d2b9d
Add ZipVoice ( #1964 )
* Add ZipVoice - a flow-matching based zero-shot TTS model.
2025-06-17 20:17:12 +08:00
yfyeung
7c30dd570b
restrict deepspeed >=0.16.9
2025-05-28 03:42:03 +00:00
Zengwei Yao
ffb7d05635
refactor branch exchange in cr-ctc ( #1954 )
2025-05-27 12:09:59 +08:00
yfyeung
11ccaa3ab8
add requirements.txt
2025-05-26 04:11:28 +00:00
Yifan Yang
d1a535dc76
Merge branch 'k2-fsa:master' into dev/speechllm
2025-05-24 13:13:42 +08:00
Mahsa Yarmohammadi
021e1a8846
Add acknowledgment to README ( #1950 )
2025-05-22 22:06:35 +08:00
Tianxiang Zhao
30e7ea4b5a
Fix a bug in finetune.py --use-mux ( #1949 )
2025-05-22 12:05:01 +08:00
Fangjun Kuang
fd8f8780fa
Fix logging torch.dtype. ( #1947 )
2025-05-21 12:04:57 +08:00
Yifan Yang
24b6f42340
fix typos in docs
fix typo in RESULTS.md
Update RESULTS.md
2025-05-13 14:51:17 +08:00
yifanyeung
62dfe56cbe
restore checkpoint save after validation
2025-05-13 06:14:59 +00:00
yfyeung
06667e1f6d
add batch shave mechanism
fix
fix
2025-05-12 17:39:15 +00:00
Yifan Yang
ea20ac208d
Merge branch 'k2-fsa:master' into dev/speechllm
2025-05-12 20:31:41 +08:00
Yifan Yang
e79833aad2
ensure SwooshL/SwooshR output dtype matches input dtype ( #1940 )
2025-05-12 19:28:48 +08:00
Yifan Yang
c709ce433d
Merge branch 'k2-fsa:master' into dev/speechllm
2025-05-12 14:38:13 +08:00
yfyeung
2793ccdf56
remove checkpoint save after validation
2025-05-12 06:36:20 +00:00
Yifan Yang
4627969ccd
fix bug: undefined name 'partial' ( #1941 )
2025-05-12 14:19:53 +08:00
yfyeung
c078772e59
skip OOM
2025-05-11 17:23:19 +00:00
yfyeung
9939c2b72d
remove duplicated torch autocast
2025-05-11 17:03:44 +00:00
Yifan Yang
5fbeed9f96
fix SwooshR and SwooshL
2025-05-12 00:48:42 +08:00
yfyeung
cd3adad46d
use quadratic-duration
2025-05-10 17:47:30 +00:00
yfyeung
c75767f600
set world_size and rank explicitly
update
2025-05-10 17:47:28 +00:00
Yifan Yang
2420d0c95f
update multi_dataset.py
2025-05-10 02:13:25 +08:00
yfyeung
ec6c8f748d
fix data prepare
update
2025-05-09 17:20:38 +00:00
Yifan Yang
489c42b45e
support zipformer encoder
update
update
update
update
fix
reformat
support infer
update
2025-05-08 14:44:09 +00:00
Yifan Yang
211c01bc1d
format train.py
minor fix train.py
2025-05-08 04:30:02 +00:00
Yifan Yang
23b5a7ce3e
format multi_dataset.py
2025-05-08 04:28:57 +00:00
Yifan Yang
9c8c4314de
init zipformer_llm_zh
2025-05-07 12:18:41 +00:00
Yifan Yang
dc07bba236
init
fix
2025-04-30 09:58:33 +00:00
Yifan Yang
cd7caf12df
Fix speech_llm recipe ( #1936 )
* fix training/decoding scripts, cleanup unused code, and ensure compliance with style checks
---------
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2025-04-30 11:41:00 +08:00
Fangjun Kuang
cc2e64a6aa
Fix convert_texts_into_ids() in the tedlium3 recipe. ( #1929 )
2025-04-24 17:04:46 +08:00
Yifan Yang
5ec95e5482
Fix SpeechLLM recipe ( #1926 )
2025-04-23 16:18:38 +08:00
math345
64c5364085
Fix bug: When resuming training from a checkpoint, model_avg was not assigned, resulting in a None error. ( #1914 )
2025-04-10 11:37:28 +08:00
Fangjun Kuang
300a821f58
Fix aishell training ( #1916 )
2025-04-10 10:30:37 +08:00
Fangjun Kuang
171cf8c9fe
Avoid redundant computation in PiecewiseLinear. ( #1915 )
2025-04-09 11:52:37 +08:00
Wei Kang
86bd16d496
[KWS] Remove graph compiler ( #1905 )
2025-04-02 22:10:06 +08:00