1236 Commits

Author SHA1 Message Date
Bailey Hirota
6d71d9cff4 remove bilingual tag from train.py 2025-07-28 17:52:28 +09:00
Kinan Martin
3751441dad deprecate params.bilingual=0, replace ReazonSpeechAsrDataModule for MultiDatasetAsrDataModule, not tested yet 2025-07-28 17:49:35 +09:00
Bailey Hirota
61e81bfc26 Revert "add fbank"
This reverts commit ba603e0a0a514056ec6d32677053c41743a1a5dd.
2025-07-28 17:49:35 +09:00
Bailey Hirota
c83b115b49 add fbank 2025-07-28 17:49:35 +09:00
Kinan Martin
abebb6aaf0 new version of multi_ja_en prepare.sh script which swaps Librispeech for MLS English 2025-07-28 17:49:35 +09:00
Kinan Martin
fa84782b21 optimize with num_jobs on save_audios 2025-07-28 17:49:35 +09:00
Kinan Martin
f2e01712de fix stage 2 and 3 2025-07-28 17:49:35 +09:00
Kinan Martin
59519a41fa fix validation manifest name 2025-07-28 17:49:35 +09:00
Kinan Martin
4ca8ee94f0 adjusted prepare.sh to only calculate fbank and manifest together; adjust datamodule to load from manifest files 2025-07-28 17:49:35 +09:00
Kinan Martin
d6e3c98e58 move compute_fbank_mls_english.py, add validate_manifest.py, add shared symlink to librispeech 2025-07-28 17:49:35 +09:00
Kinan Martin
68e3ceaaac instead of on-the-fly features, precompute fbank and manifests in prepare.sh 2025-07-28 17:49:35 +09:00
Kinan Martin
ce44150e25 readme 2025-07-28 17:49:35 +09:00
Kinan Martin
a34d34a38e pre-commit hooks 2025-07-28 17:49:35 +09:00
Kinan Martin
898525962c separate transcript prep stage from bpe train stage 2025-07-28 17:49:35 +09:00
Kinan Martin
8c1c7100d3 symlink copied files to librispeech recipe dir 2025-07-28 17:49:35 +09:00
Kinan Martin
efe015d568 cleaned-up version of recipe 2025-07-28 17:49:35 +09:00
Kinan Martin
defc71bc6a replace file 2025-07-28 17:49:35 +09:00
Kinan Martin
a1fc6420f9 change default path 2025-07-28 17:49:35 +09:00
Kinan Martin
ac0c0edddb update prepare.sh, fix asr_datamodule.py 2025-07-28 17:49:35 +09:00
Kinan Martin
28f65458b3 WIP v0 MLS English recipe 2025-07-28 17:49:35 +09:00
Fangjun Kuang
e22bc78f98
Export streaming zipformer2 to RKNN (#1977) 2025-07-11 13:24:01 +08:00
Teo Wen Shen
da87e7fc99
add weights_only=False to torch.load (#1984) 2025-07-10 15:27:08 +08:00
Yifan Yang
89728dd4f8
Refactor data preparation for GigaSpeech recipe (#1986) 2025-07-10 11:17:37 +08:00
Mistmoon
9293edc62f
Add cr-ctc loss and ctc-decode in aishell (#1980) 2025-07-08 14:47:24 +08:00
Fangjun Kuang
fba5e67d5e
Fix CI tests. (#1974)
- Introduce unified AMP helpers (create_grad_scaler, torch_autocast) to handle
  deprecations in PyTorch ≥2.3.0

- Replace direct uses of torch.cuda.amp.GradScaler and torch.cuda.amp.autocast
  with the new utilities across all training and inference scripts

- Update all torch.load calls to include weights_only=False for compatibility with
  newer PyTorch versions
2025-07-01 13:47:55 +08:00
Fangjun Kuang
71377d21cd
Export streaming zipformer models with whisper feature to onnx (#1973) 2025-06-30 19:01:15 +08:00
Fangjun Kuang
abd9437e6d
Add more wheels for piper-phonemize (#1969) 2025-06-24 14:49:16 +08:00
Wei Kang
e1cf4dbace
rm zipvoice (#1967) 2025-06-23 19:22:35 +08:00
Wei Kang
343b8fa2dc
Using non strict match in context graph for contextual words (#1952) 2025-06-19 12:27:15 +08:00
Wei Kang
f80a2ee110
Decrease num_buckets & remove shuffle_buffer_size (#1955) 2025-06-19 12:26:37 +08:00
Wei Kang
3587c4b3b7
Fix decoding byte bpes tokens to words. (#1966) 2025-06-19 12:26:01 +08:00
Wei Kang
762f965cf7
[zipvoice] Add requirements.txt and pinyin.txt, remove k2 from pretrained model inference. (#1965)
* Add requirements.txt and pinyin.txt needed by zipvoice

* simplify the requirements for pretrained model inference
2025-06-18 18:38:46 +08:00
Wei Kang
06539d2b9d
Add Zipvoice (#1964)
* Add ZipVoice - a flow-matching based zero-shot TTS model.
2025-06-17 20:17:12 +08:00
Zengwei Yao
ffb7d05635
refactor branch exchange in cr-ctc (#1954) 2025-05-27 12:09:59 +08:00
Mahsa Yarmohammadi
021e1a8846
Add acknowledgment to README (#1950) 2025-05-22 22:06:35 +08:00
Tianxiang Zhao
30e7ea4b5a
Fix a bug in finetune.py --use-mux (#1949) 2025-05-22 12:05:01 +08:00
Fangjun Kuang
fd8f8780fa
Fix logging torch.dtype. (#1947) 2025-05-21 12:04:57 +08:00
Yifan Yang
e79833aad2
ensure SwooshL/SwooshR output dtype matches input dtype (#1940) 2025-05-12 19:28:48 +08:00
Yifan Yang
4627969ccd
fix bug: undefined name 'partial' (#1941) 2025-05-12 14:19:53 +08:00
Yifan Yang
cd7caf12df
Fix speech_llm recipe (#1936)
* fix training/decoding scripts, cleanup unused code, and ensure compliance with style checks

---------

Co-authored-by: Your Name <you@example.com>
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2025-04-30 11:41:00 +08:00
Fangjun Kuang
cc2e64a6aa
Fix convert_texts_into_ids() in the tedlium3 recipe. (#1929) 2025-04-24 17:04:46 +08:00
Yifan Yang
5ec95e5482
Fix SpeechLLM recipe (#1926) 2025-04-23 16:18:38 +08:00
math345
64c5364085
Fix bug: When resuming training from a checkpoint, model_avg was not assigned, resulting in a None error. (#1914) 2025-04-10 11:37:28 +08:00
Fangjun Kuang
300a821f58
Fix aishell training (#1916) 2025-04-10 10:30:37 +08:00
Fangjun Kuang
171cf8c9fe
Avoid redundant computation in PiecewiseLinear. (#1915) 2025-04-09 11:52:37 +08:00
Wei Kang
86bd16d496
[KWS]Remove graph compiler (#1905) 2025-04-02 22:10:06 +08:00
Fangjun Kuang
db9fb8ad31
Add scripts to export streaming zipformer(v1) to RKNN (#1882) 2025-02-27 17:10:58 +08:00
Yuekai Zhang
2ba665abca
Add F5-TTS with semantic token training results (#1880)
* add cosy token

* update inference code

* add extract cosy token

* update results

* add requirements.txt

* update readme

---------

Co-authored-by: yuekaiz <yuekaiz@h20-7.cm.cluster>
Co-authored-by: yuekaiz <yuekaiz@mgmt1-login.cm.cluster>
2025-02-24 13:58:47 +08:00
Machiko Bailey
da597ad782
Update RESULTS.md (#1873) 2025-02-04 09:04:25 +08:00
Machiko Bailey
0855b0338a
Merge japanese-to-english multilingual branch (#1860)
* add streaming support to reazonresearch

* update README for streaming

* Update RESULTS.md

* add onnx decode

---------

Co-authored-by: root <root@KDA03.cm.cluster>
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
Co-authored-by: root <root@KDA01.cm.cluster>
Co-authored-by: zr_jin <peter.jin.cn@gmail.com>
2025-02-04 01:33:09 +08:00