Teo Wen Shen
da87e7fc99
add weights_only=False to torch.load ( #1984 )
2025-07-10 15:27:08 +08:00
Fangjun Kuang
fba5e67d5e
Fix CI tests. ( #1974 )
...
- Introduce unified AMP helpers (create_grad_scaler, torch_autocast) to handle
  deprecations in PyTorch ≥2.3.0
- Replace direct uses of torch.cuda.amp.GradScaler and torch.cuda.amp.autocast
  with the new utilities across all training and inference scripts
- Update all torch.load calls to include weights_only=False for compatibility with
  newer PyTorch versions
2025-07-01 13:47:55 +08:00
Fangjun Kuang
fd8f8780fa
Fix logging torch.dtype. ( #1947 )
2025-05-21 12:04:57 +08:00
Machiko Bailey
0855b0338a
Merge japanese-to-english multilingual branch ( #1860 )
...
* add streaming support to reazonresearch
* update README for streaming
* Update RESULTS.md
* add onnx decode
---------
Co-authored-by: root <root@KDA03.cm.cluster>
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
Co-authored-by: root <root@KDA01.cm.cluster>
Co-authored-by: zr_jin <peter.jin.cn@gmail.com>
2025-02-04 01:33:09 +08:00
Han Zhu
ab91112909
Improve infinity-check ( #1862 )
...
1. Attach the inf-check hooks if the grad scale is getting too small.
2. Add try-catch to avoid OOM in the inf-check hooks.
3. Set warmup_start=0.1 to reduce chances of divergence
2025-01-09 15:05:38 +08:00
Han Zhu
df46a3eaf9
Warn instead of raising exceptions in inf-check ( #1852 )
2024-12-31 16:52:06 +08:00
Han Zhu
57e9f2a8db
Add the "rms-sort" diagnostics ( #1851 )
2024-12-30 15:27:05 +08:00
Fangjun Kuang
d4d4f281ec
Revert "Replace deprecated pytorch methods ( #1814 )" ( #1841 )
...
This reverts commit 3e4da5f78160d3dba3bdf97968bd7ceb8c11631f.
2024-12-18 16:49:57 +08:00
Li Peng
3e4da5f781
Replace deprecated pytorch methods ( #1814 )
...
* Replace deprecated pytorch methods
- torch.cuda.amp.GradScaler(...) => torch.amp.GradScaler("cuda", ...)
- torch.cuda.amp.autocast(...) => torch.amp.autocast("cuda", ...)
* Replace `with autocast(...)` with `with autocast("cuda", ...)`
Co-authored-by: Li Peng <lipeng@unisound.ai>
2024-12-16 10:24:16 +08:00
zr_jin
87cadfcd2e
fixed formatting issue ( #1791 )
...
* isort fixed formatting issue
2024-10-30 21:14:12 +08:00
Wei Kang
d513d456b8
Add prefix beam search and corresponding decoding methods ( #1786 )
...
* Add prefix beam search / shallow fusion / hotwords in librispeech ctc decode
* Add librispeech cr-ctc prefix beam search results
2024-10-30 10:14:34 +08:00
Fangjun Kuang
516b4869b3
Add Matcha-TTS ( #1773 )
2024-10-29 15:04:04 +08:00
zr_jin
88bacfb9e6
minor fixes for the repo ( #1775 )
...
* minor fixes for the repo
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2024-10-21 13:51:56 +08:00
Zengwei Yao
693d84a301
Add Consistency-Regularized CTC ( #1766 )
...
* support consistency-regularized CTC
* update arguments of cr-ctc
* set default value of cr_loss_masked_scale to 1.0
* minor fix
* refactor codes
* update RESULTS.md
2024-10-21 10:35:26 +08:00
zzasdf
2653df5bda
fix the mismatch in batch_idx_train ( #1757 )
2024-10-12 19:14:28 +08:00
Fangjun Kuang
2e13298717
Refactor ctc greedy search. ( #1691 )
...
Use torch.unique_consecutive() to avoid reinventing the wheel.
2024-07-15 12:01:47 +08:00
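
The refactor in #1691 relies on torch.unique_consecutive() to collapse repeated frame-level predictions. As a rough illustration of that collapse-then-drop-blanks step, here is a minimal pure-Python sketch that uses itertools.groupby as a stand-in for torch.unique_consecutive(). The function name ctc_greedy_collapse and the default blank_id=0 are assumptions made for this example; they are not taken from the commit.

```python
from itertools import groupby
from typing import List


def ctc_greedy_collapse(frame_ids: List[int], blank_id: int = 0) -> List[int]:
    """Collapse per-frame argmax token ids into a CTC output sequence.

    Hypothetical helper for illustration only; blank_id=0 is an assumption.
    """
    # Step 1: merge runs of identical consecutive ids. This is the same
    # operation torch.unique_consecutive() performs on a 1-D tensor.
    collapsed = [token for token, _ in groupby(frame_ids)]
    # Step 2: drop blanks. A blank between two equal tokens keeps them
    # separate in step 1, so genuine repeats survive.
    return [token for token in collapsed if token != blank_id]


if __name__ == "__main__":
    print(ctc_greedy_collapse([0, 3, 3, 0, 3, 5, 5, 0]))
```

The order of the two steps matters: removing blanks first would merge the two separate occurrences of token 3 into one.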
Zengwei Yao
d47c078286
add decoding method of ctc-greedy-search in zipformer recipe ( #1690 )
2024-07-14 17:30:13 +08:00
Zengwei Yao
f76afff741
Support CTC/AED option for Zipformer recipe ( #1389 )
...
* add attention-decoder loss option for zipformer recipe
* add attention-decoder-rescoring
* update export.py and pretrained_ctc.py
* update RESULTS.md
2024-07-05 20:19:18 +08:00
Fangjun Kuang
13f55d0735
Add merge_tokens for ctc forced alignment ( #1649 )
2024-06-12 17:45:13 +08:00
Daniel Povey
4d5c1f2e60
Remove inf from stored stats ( #1647 )
2024-06-10 22:41:54 +08:00
zr_jin
42a97f6d7b
Update env.py ( #1635 )
2024-05-22 22:29:38 +08:00
Yifan Yang
4e97b19b63
Remove duplicate logging initialization logic in utils.py ( #1617 )
2024-05-06 13:00:27 +08:00
Zengwei Yao
c08fe48603
add force=True to logging.basicConfig ( #1613 )
2024-05-04 11:42:23 +08:00
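
For context on #1613: logging.basicConfig() does nothing if the root logger already has handlers, unless force=True is passed (available since Python 3.8). The sketch below shows that behaviour in isolation. The function name setup_logging and the format string are assumptions for this example and do not reproduce the code in icefall/utils.py.

```python
import logging


def setup_logging(level: int = logging.INFO) -> None:
    """Configure the root logger, replacing any earlier configuration.

    Hypothetical helper for illustration only.
    """
    logging.basicConfig(
        format="%(asctime)s %(levelname)s %(message)s",
        level=level,
        # force=True removes and closes handlers already attached to the
        # root logger before applying this configuration.
        force=True,
    )


# The first call installs a handler. Without force=True the second call
# would be silently ignored and the level would stay at INFO.
setup_logging(logging.INFO)
setup_logging(logging.DEBUG)
```

This is useful when a library imported earlier has already configured logging, which would otherwise make a later basicConfig() call ineffective.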
Dongji Gao
9a17f4ce41
add OTC related scripts using phone as units instead of BPEs ( #1602 )
...
* add otc related scripts using phone instead of bpe
2024-04-26 00:55:44 +08:00
Yifan Yang
368b7d10a7
clear log handlers before setup ( #1603 )
2024-04-24 15:31:25 +09:00
zr_jin
d5cd78a637
Update hooks.py ( #1564 )
2024-03-20 16:43:45 +08:00
zr_jin
9bd30853ae
Update diagnostics.py ( #1562 )
2024-03-20 15:35:14 +08:00
zr_jin
413220d6a4
Minor fixes for the multi_zh_en recipe ( #1526 )
2024-03-18 20:25:57 +08:00
zr_jin
eb132da00d
additional instruction for the grad_scale is too small error ( #1550 )
2024-03-14 11:33:49 +08:00
zr_jin
242002e0bd
Strengthened style constraints ( #1527 )
2024-03-04 23:28:04 +08:00
Wei Kang
aac7df064a
Recipes for open vocabulary keyword spotting ( #1428 )
...
* English recipe on gigaspeech; Chinese recipe on wenetspeech
2024-02-22 15:31:20 +08:00
zr_jin
027302c902
minor fix for param. names ( #1495 )
2024-02-20 14:38:51 +08:00
safarisadegh
d9ae8c02a0
Update README.md ( #1497 )
2024-02-09 15:05:01 +08:00
Henry Li Xinyuan
b07d5472c5
Implement recipe for Fluent Speech Commands dataset ( #1469 )
...
---------
Signed-off-by: Xinyuan Li <xli257@c13.clsp.jhu.edu>
2024-01-31 22:53:36 +08:00
Yuekai Zhang
1c30847947
Whisper Fine-tuning Recipe on Aishell1 ( #1466 )
...
* add decode seamlessm4t
* add requirements
* add decoding with avg model
* add token files
* add custom tokenizer
* support deepspeed to finetune large model
* support large-v3
* add model saving
* using monkey patch to replace models
* add manifest dir option
2024-01-27 00:32:30 +08:00
Wei Kang
11d816d174
Add customized score for hotwords ( #1385 )
...
* add custom score for each hotword
* Add more comments
* Fix decode
* fix style
* minor fixes
2023-11-18 18:47:55 +08:00
zr_jin
6d275ddf9f
fixed broken softlinks ( #1381 )
...
* removed broken softlinks
* fixed dependencies
* fixed file permission
2023-11-10 14:45:16 +08:00
lishaojie
1b2e99d374
add the pruned_transducer_stateless7_streaming recipe for commonvoice ( #1018 )
...
* add the pruned_transducer_stateless7_streaming recipe for commonvoice
* fix the symlinks
* Update RESULTS.md
2023-11-09 22:07:28 +08:00
zr_jin
23913f6afd
Minor refinements for some stale but recently merged PRs ( #1354 )
...
* incorporate https://github.com/k2-fsa/icefall/pull/1269
* incorporate https://github.com/k2-fsa/icefall/pull/1301
* black formatted
* incorporate https://github.com/k2-fsa/icefall/pull/1162
* black formatted
2023-10-31 10:28:20 +08:00
Rudra
eef47adee9
fix typo ( #1324 )
2023-10-19 22:54:43 +08:00
Daniel Povey
973dc1026d
Make diagnostics.py more error-tolerant and have wider range of supported torch versions ( #1234 )
2023-10-19 22:54:00 +08:00
Karel Vesely
543b4cc1ca
small enhancements ( #1322 )
...
- add extra check of 'x' and 'x_lens' to earlier point in Transducer model
- specify 'utf' encoding when opening text files for writing (recogs, errs)
2023-10-19 21:53:31 +08:00
Surav Shrestha
36c60b0cf6
fix typos in icefall/utils.py ( #1319 )
2023-10-19 11:15:18 +08:00
marcoyang1998
16a2748d6c
PromptASR for contextualized ASR with controllable style ( #1250 )
...
* Add PromptASR with BERT as text encoder
* Support using word-list based content prompts for context biasing
* Upload the pretrained models to huggingface
* Add usage example
2023-10-11 14:56:41 +08:00
Fangjun Kuang
f14b673408
Add HLG decoding with OpenFst on CPU for aishell conformer_ctc ( #1279 )
2023-10-01 13:46:16 +08:00
Dongji Gao
3abc290c11
Add scripts and recipe for BTC/OTC ( #1255 )
2023-09-29 07:52:46 +08:00
Fangjun Kuang
2318c3fbd0
Support CTC decoding on CPU using OpenFst and kaldi decoders. ( #1244 )
2023-09-26 16:36:19 +08:00
zr_jin
ef5da4824d
formatted the entire LibriSpeech recipe ( #1270 )
...
* formatted the entire librispeech recipe
* minor updates
2023-09-24 17:31:01 +08:00
zr_jin
3199058194
enable sclite_mode for swbd scoring ( #1239 )
2023-09-09 21:25:26 +08:00
Wei Kang
4d7f73ce65
Add context biasing for zipformer recipe ( #1204 )
...
* Add context biasing for zipformer recipe
* support context biasing in modified_beam_search_LODR
* fix context graph
* Minor fixes
2023-08-28 19:37:32 +08:00