37 Commits

Author SHA1 Message Date
Teo Wen Shen
da87e7fc99
add weights_only=False to torch.load (#1984) 2025-07-10 15:27:08 +08:00
Fangjun Kuang
fba5e67d5e
Fix CI tests. (#1974)
- Introduce unified AMP helpers (create_grad_scaler, torch_autocast) to handle 
  deprecations in PyTorch ≥2.3.0

- Replace direct uses of torch.cuda.amp.GradScaler and torch.cuda.amp.autocast 
  with the new utilities across all training and inference scripts

- Update all torch.load calls to include weights_only=False for compatibility with 
  newer PyTorch versions
2025-07-01 13:47:55 +08:00
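Note: a minimal sketch of what version-aware AMP helpers along these lines could look like. The names `create_grad_scaler` and `torch_autocast` and the 2.3.0 threshold come from the commit message above. The signatures, the version parsing, and the helper `uses_new_amp_api` are assumptions made for illustration, not the repository's actual code.

```python
import re


def uses_new_amp_api(torch_version: str) -> bool:
    """Return True if the version string is >= 2.3 (threshold from the commit message)."""
    match = re.match(r"(\d+)\.(\d+)", torch_version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {torch_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (2, 3)


def create_grad_scaler(device: str = "cuda", **kwargs):
    """Hypothetical helper: build a GradScaler without triggering deprecation warnings."""
    import torch  # imported lazily so the version check above has no torch dependency

    if uses_new_amp_api(torch.__version__):
        return torch.amp.GradScaler(device, **kwargs)
    return torch.cuda.amp.GradScaler(**kwargs)


def torch_autocast(device_type: str = "cuda", **kwargs):
    """Hypothetical helper: return an autocast context manager for the installed PyTorch."""
    import torch

    if uses_new_amp_api(torch.__version__):
        return torch.amp.autocast(device_type=device_type, **kwargs)
    return torch.cuda.amp.autocast(**kwargs)
```

In a training script, calls such as `scaler = create_grad_scaler(enabled=True)` and `with torch_autocast(enabled=True):` would then replace the direct `torch.cuda.amp` calls, keeping the version check in one place.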
Fangjun Kuang
d4d4f281ec
Revert "Replace deprecated pytorch methods (#1814)" (#1841)
This reverts commit 3e4da5f78160d3dba3bdf97968bd7ceb8c11631f.
2024-12-18 16:49:57 +08:00
Li Peng
3e4da5f781
Replace deprecated pytorch methods (#1814)
* Replace deprecated pytorch methods

- torch.cuda.amp.GradScaler(...) => torch.amp.GradScaler("cuda", ...)
- torch.cuda.amp.autocast(...) => torch.amp.autocast("cuda", ...)

* Replace `with autocast(...)` with `with autocast("cuda", ...)`


Co-authored-by: Li Peng <lipeng@unisound.ai>
2024-12-16 10:24:16 +08:00
zr_jin
a394bf7474
fixed gss scripts for alimeeting and ami recipes (#1749) 2024-09-08 20:35:07 +08:00
zr_jin
65b8a6c730
fixed wrong default value for the alimeeting recipe (#1750) 2024-09-08 20:34:49 +08:00
zr_jin
559c8a7160
fixed a typo in prepare.sh for alimeeting recipes (#1747) 2024-09-08 17:10:17 +08:00
zr_jin
eb132da00d
additional instruction for the grad_scale is too small error (#1550) 2024-03-14 11:33:49 +08:00
BannerWang
959906e9dc
Correct alimeeting download link (#1544)
Co-authored-by: BannerWang <banner.wang@upblocks.io>
2024-03-12 12:44:09 +08:00
Yuekai Zhang
5df24c1685
Whisper large fine-tuning on wenetspeech, multi-hans-zh (#1483)
* add whisper fbank for wenetspeech

* add whisper fbank for other dataset

* add str to bool

* add decode for wenetspeech

* add requirements.txt

* add original model decode with 30s

* test feature extractor speed

* add aishell2 feat

* change compute feature batch

* fix overwrite

* fix executor

* regression

* add kaldifeatwhisper fbank

* fix io issue

* parallel jobs

* use multi machines

* add wenetspeech fine-tune scripts

* add monkey patch codes

* remove useless file

* fix subsampling factor

* fix too long audios

* add remove long short

* fix whisper version to support multi batch beam

* decode all wav files

* remove utterance more than 30s in test_net

* only test net

* using soft links

* add kespeech whisper feats

* fix index error

* add manifests for whisper

* change to LilcomChunkyWriter

* add missing option

* decrease cpu usage 

* add speed perturb for kespeech

* fix kespeech speed perturb

* add dataset

* load checkpoint from specific path

* add speechio

* add speechio results

---------

Co-authored-by: zr_jin <peter.jin.cn@gmail.com>
2024-03-07 19:04:27 +08:00
zr_jin
37b975cac9
fixed a CI test for wenetspeech (#1476)
* Comply with issue #1149

https://github.com/k2-fsa/icefall/issues/1149
2024-01-27 06:41:56 +08:00
Yifan Yang
5dfc3ed7f9
Fix buffer size of DynamicBucketingSampler (#1468)
* Fix buffer size

* Fix for flake8

---------

Co-authored-by: yifanyeung <yifanyeung@yifanyeung.local>
2024-01-21 02:10:42 +08:00
Fangjun Kuang
8136ad775b
Use high_freq -400 in computing fbank features. (#1447)
See also https://github.com/k2-fsa/sherpa-onnx/issues/514
2024-01-04 13:59:32 +08:00
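Note: a small sketch of how a negative `high_freq` is usually interpreted, assuming the Kaldi-style convention in which a non-positive value is an offset from the Nyquist frequency. The function name `effective_high_freq` and the 16 kHz sample rate in the usage line are assumptions for illustration.

```python
def effective_high_freq(sample_rate: int, high_freq: float) -> float:
    """Resolve the upper mel cutoff in Hz.

    Assumes the Kaldi-style rule: a value <= 0 is added to the Nyquist
    frequency, while a positive value is used as an absolute cutoff.
    """
    nyquist = sample_rate / 2.0
    if high_freq <= 0:
        return nyquist + high_freq
    return high_freq


# Assumed 16 kHz audio: the cutoff sits 400 Hz below Nyquist.
cutoff_hz = effective_high_freq(16000, -400)
```

Under that assumption, `high_freq=-400` keeps the top mel bin 400 Hz below Nyquist regardless of the sample rate, which is why an offset is often preferred over a hard-coded absolute cutoff.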
zr_jin
10a234709c
bugs fixed (#1416) 2023-12-14 11:26:37 +08:00
zr_jin
d2bd0933b1
Compatibility with the latest Lhotse (#1314) 2023-10-17 21:22:32 +08:00
zr_jin
ef658d691e
fixes for init value of diagnostics.TensorDiagnosticOptions (#1269)
* fixes for `diagnostics`

Replace `2 ** 22` with `512` as the default value of `diagnostics.TensorDiagnosticOptions`

also black formatted some scripts

* fixed formatting issues
2023-09-24 17:06:47 +08:00
zr_jin
7cc2dae940
Fixes for compatibility with the latest Lhotse release (#1249) 2023-09-13 12:39:49 +08:00
zr_jin
74806b744b
disable speed perturbation by default (#1176)
* disable speed perturbation by default

* minor fixes

* minor updates

* updated bash scripts to support the `speed-perturb` arg

* minor fixes

1. changed the naming scheme from `speed-perturb` to `perturb-speed` to align with the librispeech recipe

>> 00256a7669/egs/librispeech/ASR/local/compute_fbank_librispeech.py (L65)

2. changed arg type for `perturb-speed` to str2bool
2023-08-10 20:56:02 +08:00
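Note: a minimal sketch of a `str2bool`-typed `--perturb-speed` option that is off by default, as the commit above describes. The function bodies, accepted spellings, and help text are assumptions for illustration and may differ from the repository's own `str2bool`.

```python
import argparse


def str2bool(value: str) -> bool:
    """Parse common textual spellings of a boolean for use as an argparse type."""
    if isinstance(value, bool):
        return value
    lowered = value.lower()
    if lowered in ("yes", "true", "t", "y", "1"):
        return True
    if lowered in ("no", "false", "f", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"Boolean value expected, got {value!r}")


def get_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--perturb-speed",
        type=str2bool,
        default=False,  # speed perturbation disabled unless requested
        help="Whether to apply speed perturbation when computing features.",
    )
    return parser


args = get_parser().parse_args(["--perturb-speed", "True"])
```

A plain `type=bool` would not work here, because `bool("False")` is `True` in Python: any non-empty string is truthy. An explicit parser avoids that pitfall.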
Fangjun Kuang
7b0afbdc16
Remove cur_batch_idx (#1102) 2023-05-30 14:49:54 +08:00
Fangjun Kuang
f5de2e90c6
Fix style issues. (#937) 2023-03-08 22:56:04 +08:00
pehonnet
07243d136a
remove key from result filename (#936)
Co-authored-by: pe-honnet <pe.honnet@telepathy.ai>
2023-03-08 21:06:07 +08:00
Desh Raj
c4aaf3ea3b
Add AliMeeting multi-condition training recipe (#751)
* add AliMeeting multi-domain recipe

* convert scripts to symbolic links
2022-12-10 18:15:23 +08:00
marcoyang
53454701cb
fix segmentation fault 2022-11-22 11:39:21 +08:00
Desh Raj
d31db01037
manual correction of black formatting 2022-11-17 14:18:05 -05:00
Desh Raj
107df3b115
apply black on all files 2022-11-17 09:42:17 -05:00
Fangjun Kuang
60317120ca
Revert "Apply new Black style changes" 2022-11-17 20:19:32 +08:00
Desh Raj
d110b04ad3
apply new black formatting to all files 2022-11-16 13:06:43 -05:00
Fangjun Kuang
d1f16a04bd
fix type hints for decode.py (#623) 2022-10-18 06:56:12 +08:00
Fangjun Kuang
e18fa78c3a
Check that read_manifests_if_cached returns a non-empty dict. (#555) 2022-08-28 11:50:11 +08:00
rickychanhoyin
2636a3dd58
minor changes for correct path names && import module text2segments.py (#552)
* Update asr_datamodule.py

minor file names correction

* minor changes for correct path names && import module text2segments.py
2022-08-27 17:23:45 +08:00
Wei Kang
5c17255eec
Sort results to make it more convenient to compare decoding results (#522)
* Sort result to make it more convenient to compare decoding results

* Add cut_id to recognition results

* add cut_id to results for all recipes

* Fix torch.jit.script

* Fix comments

* Minor fixes

* Fix torch.jit.tracing for Pytorch version before v1.9.0
2022-08-12 07:12:50 +08:00
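Note: a rough sketch of the idea behind this change, assuming each recognition result is stored as a `(cut_id, reference words, hypothesis words)` tuple. The tuple layout, the function names, and the output format are assumptions for illustration, not the repository's actual code.

```python
from typing import List, Tuple

# Assumed record layout: (cut_id, reference words, hypothesis words).
Result = Tuple[str, List[str], List[str]]


def sort_by_cut_id(results: List[Result]) -> List[Result]:
    """Return results ordered by cut_id so that two decoding runs line up."""
    return sorted(results, key=lambda result: result[0])


def format_results(results: List[Result]) -> str:
    """Render sorted results as text, one ref line and one hyp line per cut."""
    lines = []
    for cut_id, ref, hyp in sort_by_cut_id(results):
        lines.append(f"{cut_id}:\tref={ref}")
        lines.append(f"{cut_id}:\thyp={hyp}")
    return "\n".join(lines)
```

With a stable ordering keyed on `cut_id`, result files from two decoding runs can be compared line by line with an ordinary diff tool.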
Mingshuang Luo
998091ef52
do some changes for export.py (#437) 2022-06-20 14:57:08 +08:00
Fangjun Kuang
dbda1644b5
Replace load_manifest_lazy with load_manifest for MUSAN. (#412) 2022-06-09 11:42:18 +08:00
Fangjun Kuang
ed66877694
Replace ChunkedLilcomHdf5Writer with LilcomChunkyWriter. (#411) 2022-06-09 11:18:52 +08:00
Fangjun Kuang
f1abce72f8
Use jsonl for CutSet in the LibriSpeech recipe. (#397)
* Use jsonl for cutsets in the librispeech recipe.

* Use lazy cutset for all recipes.

* More fixes to use lazy CutSet.

* Remove force=True from logging to support Python < 3.8

* Minor fixes.

* Fix style issues.
2022-06-06 10:19:16 +08:00
Mingshuang Luo
e5884f82e0
[Ready to merge] Add prefix for compute fbank (#398)
* add prefix

* add prefix
2022-06-05 18:17:52 +08:00
Mingshuang Luo
beab229fd7
[Ready to merge] Pruned_transducer_stateless2 for alimeeting dataset (#378)
* add pruned-rnnt2 recipe for alimeeting dataset

* update code for merging

* change LilcomHdf5Writer to ChunkedLilcomHdf5Writer

* change for test.yml

* change for test.yml

* change for test.yml

* change for workflow yml

* change for yml

* change for yml

* change for README.md

* change for yml

* solve the conflicts

* solve the conflicts
2022-06-04 13:47:46 +08:00