Kinan Martin
5682978c64
switch mls_english clone from https to ssh
2025-08-05 18:33:03 +09:00
Kinan Martin
2265e1afed
fix stage 5 output pathing
2025-08-05 18:31:48 +09:00
Kinan Martin
7bea23e954
restore version of mls_english compute_fbank_mls_english.py and prepare.sh from commit 547f5c5
2025-08-05 18:30:41 +09:00
Bailey Hirota
8b035a0c96
remove bilingual tag from train.py
2025-08-05 18:29:29 +09:00
Kinan Martin
99db0e4643
deprecate params.bilingual=0, replace ReazonSpeechAsrDataModule for MultiDatasetAsrDataModule, not tested yet
2025-08-05 18:16:17 +09:00
Bailey Hirota
31a37c7e44
Revert "add fbank"
This reverts commit ba603e0a0a514056ec6d32677053c41743a1a5dd.
2025-08-05 18:15:04 +09:00
Bailey Hirota
7d462aa8b4
add fbank
2025-08-05 18:13:51 +09:00
Kinan Martin
06e429131b
new version of multi_ja_en prepare.sh script which swaps Librispeech for MLS English
2025-08-05 18:12:40 +09:00
Kinan Martin
0e86ef805c
optimize with num_jobs on save_audios
2025-08-05 18:11:22 +09:00
Kinan Martin
73dea24fd9
fix stage 2 and 3
2025-08-05 18:10:15 +09:00
Kinan Martin
2504b23861
fix validation manifest name
2025-08-05 18:09:00 +09:00
Kinan Martin
eb2168bc49
adjusted prepare.sh to only calculate fbank and manifest together; adjust datamodule to load from manifest files
2025-08-05 18:07:45 +09:00
Kinan Martin
a8f45bc08b
move compute_fbank_mls_english.py, add validate_manifest.py, add shared symlink to librispeech
2025-08-05 18:06:28 +09:00
Kinan Martin
fe88d1db36
instead of on-the-fly features, precompute fbank and manifests in prepare.sh
2025-08-05 18:05:18 +09:00
Kinan Martin
996334f520
readme
2025-08-05 18:04:04 +09:00
Kinan Martin
24db8c11ba
pre-commit hooks
2025-08-05 18:02:50 +09:00
Kinan Martin
c532a503e7
separate transcript prep stage from bpe train stage
2025-08-05 18:01:41 +09:00
Kinan Martin
313afea773
symlink copied files to librispeech recipe dir
2025-08-05 18:00:24 +09:00
Kinan Martin
e76b749450
cleaned-up version of recipe
2025-08-05 17:59:15 +09:00
Kinan Martin
1b8a3061b0
replace file
2025-08-05 17:57:59 +09:00
Kinan Martin
0ab027411f
change default path
2025-08-05 17:56:51 +09:00
Kinan Martin
ba6d8e8b26
update prepare.sh, fix asr_datamodule.py
2025-08-05 17:55:40 +09:00
Kinan Martin
c92c606c5f
WIP v0 MLS English recipe
2025-08-05 17:54:30 +09:00
Fangjun Kuang
1c5d792bf1
Validate generated manifest files. (#338)
2025-08-05 17:46:36 +09:00
Fangjun Kuang
e22bc78f98
Export streaming zipformer2 to RKNN (#1977)
2025-07-11 13:24:01 +08:00
Teo Wen Shen
da87e7fc99
add weights_only=False to torch.load (#1984)
2025-07-10 15:27:08 +08:00
Yifan Yang
89728dd4f8
Refactor data preparation for GigaSpeech recipe (#1986)
2025-07-10 11:17:37 +08:00
Mistmoon
9293edc62f
Add cr-ctc loss and ctc-decode in aishell (#1980)
2025-07-08 14:47:24 +08:00
Fangjun Kuang
fba5e67d5e
Fix CI tests. (#1974)
- Introduce unified AMP helpers (create_grad_scaler, torch_autocast) to handle
  deprecations in PyTorch ≥2.3.0
- Replace direct uses of torch.cuda.amp.GradScaler and torch.cuda.amp.autocast
  with the new utilities across all training and inference scripts
- Update all torch.load calls to include weights_only=False for compatibility with
  newer PyTorch versions
2025-07-01 13:47:55 +08:00
Fangjun Kuang
71377d21cd
Export streaming zipformer models with whisper feature to onnx (#1973)
2025-06-30 19:01:15 +08:00
Fangjun Kuang
abd9437e6d
Add more wheels for piper-phonemize (#1969)
2025-06-24 14:49:16 +08:00
Wei Kang
e1cf4dbace
rm zipvoice (#1967)
2025-06-23 19:22:35 +08:00
Wei Kang
343b8fa2dc
Using non strict match in context graph for contextual words (#1952)
2025-06-19 12:27:15 +08:00
Wei Kang
f80a2ee110
Decrease num_buckets & remove shuffle_buffer_size (#1955)
2025-06-19 12:26:37 +08:00
Wei Kang
3587c4b3b7
Fix decoding byte bpes tokens to words. (#1966)
2025-06-19 12:26:01 +08:00
Wei Kang
762f965cf7
[zipvoice] Add requirements.txt and pinyin.txt, remove k2 from pretrained model inference. (#1965)
* Add requirements.txt and pinyin.txt needed by zipvoice
* simplify the requirements for pretrained model inference
2025-06-18 18:38:46 +08:00
Wei Kang
06539d2b9d
Add Zipvoice (#1964)
* Add ZipVoice - a flow-matching based zero-shot TTS model.
2025-06-17 20:17:12 +08:00
Zengwei Yao
ffb7d05635
refactor branch exchange in cr-ctc (#1954)
2025-05-27 12:09:59 +08:00
Mahsa Yarmohammadi
021e1a8846
Add acknowledgment to README (#1950)
2025-05-22 22:06:35 +08:00
Tianxiang Zhao
30e7ea4b5a
Fix a bug in finetune.py --use-mux (#1949)
2025-05-22 12:05:01 +08:00
Fangjun Kuang
fd8f8780fa
Fix logging torch.dtype. (#1947)
2025-05-21 12:04:57 +08:00
Yifan Yang
e79833aad2
ensure SwooshL/SwooshR output dtype matches input dtype (#1940)
2025-05-12 19:28:48 +08:00
Yifan Yang
4627969ccd
fix bug: undefined name 'partial' (#1941)
2025-05-12 14:19:53 +08:00
Yifan Yang
cd7caf12df
Fix speech_llm recipe (#1936)
* fix training/decoding scripts, cleanup unused code, and ensure compliance with style checks
---------
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2025-04-30 11:41:00 +08:00
Fangjun Kuang
cc2e64a6aa
Fix convert_texts_into_ids() in the tedlium3 recipe. (#1929)
2025-04-24 17:04:46 +08:00
Yifan Yang
5ec95e5482
Fix SpeechLLM recipe (#1926)
2025-04-23 16:18:38 +08:00
math345
64c5364085
Fix bug: When resuming training from a checkpoint, model_avg was not assigned, resulting in a None error. (#1914)
2025-04-10 11:37:28 +08:00
Fangjun Kuang
300a821f58
Fix aishell training (#1916)
2025-04-10 10:30:37 +08:00
Fangjun Kuang
171cf8c9fe
Avoid redundant computation in PiecewiseLinear. (#1915)
2025-04-09 11:52:37 +08:00
Wei Kang
86bd16d496
[KWS] Remove graph compiler (#1905)
2025-04-02 22:10:06 +08:00