icefall

mirror of https://github.com/k2-fsa/icefall.git synced 2025-12-11 06:55:27 +00:00

Author	SHA1	Message	Date
Yifan Yang	c103dbef78	Merge 70f13e54d814761432acc1c23e9ef4ffd566df41 into 34fc1fdf0d8ff520e2bb18267d046ca207c78ef9	2025-07-19 14:22:04 +08:00
Fangjun Kuang	34fc1fdf0d	Fix transformer decoder layer (#1995)	2025-07-18 20:12:29 +08:00
Bailey Machiko Hirota	5fe13078cc	Musan implementation for ReazonSpeech (#1988)	2025-07-18 17:16:19 +08:00
Yifan Yang	9fd0f2dc1d	support left pad for make_pad_mask (#1990)	2025-07-16 23:59:04 +08:00
Fangjun Kuang	e22bc78f98	Export streaming zipformer2 to RKNN (#1977)	2025-07-11 13:24:01 +08:00
Teo Wen Shen	da87e7fc99	add weights_only=False to torch.load (#1984)	2025-07-10 15:27:08 +08:00
Yifan Yang	89728dd4f8	Refactor data preparation for GigaSpeech recipe (#1986)	2025-07-10 11:17:37 +08:00
Mistmoon	9293edc62f	Add cr-ctc loss and ctc-decode in aishell (#1980)	2025-07-08 14:47:24 +08:00
Yifan Yang	70f13e54d8	Merge branch 'k2-fsa:master' into dev/speechllm	2025-07-07 11:32:12 +08:00
Fangjun Kuang	fba5e67d5e	Fix CI tests. (#1974) - Introduce unified AMP helpers (create_grad_scaler, torch_autocast) to handle deprecations in PyTorch ≥2.3.0 - Replace direct uses of torch.cuda.amp.GradScaler and torch.cuda.amp.autocast with the new utilities across all training and inference scripts - Update all torch.load calls to include weights_only=False for compatibility with newer PyTorch versions	2025-07-01 13:47:55 +08:00
Fangjun Kuang	71377d21cd	Export streaming zipformer models with whisper feature to onnx (#1973)	2025-06-30 19:01:15 +08:00
Fangjun Kuang	abd9437e6d	Add more wheels for piper-phonemize (#1969)	2025-06-24 14:49:16 +08:00
Wei Kang	e1cf4dbace	rm zipvoice (#1967)	2025-06-23 19:22:35 +08:00
Wei Kang	343b8fa2dc	Using non strict match in context graph for contextual words (#1952)	2025-06-19 12:27:15 +08:00
Wei Kang	f80a2ee110	Decrease num_buckets & remove shuffle_buffer_size (#1955)	2025-06-19 12:26:37 +08:00
Wei Kang	3587c4b3b7	Fix decoding byte bpes tokens to words. (#1966)	2025-06-19 12:26:01 +08:00
Yifan Yang	56349001d6	Merge branch 'k2-fsa:master' into dev/speechllm	2025-06-18 21:09:44 +08:00
Wei Kang	762f965cf7	[zipvoice] Add requirements.txt and pinyin.txt, remove k2 from pretrained model inference. (#1965) * Add requirements.txt and pinyin.txt needed by zipvoice * simplify the requirements for pretrained model inference	2025-06-18 18:38:46 +08:00
yfyeung	53111d0e46	fix for multigpu	2025-06-18 07:33:15 +00:00
yfyeung	39d90356fe	fix deepspeed config fix	2025-06-18 05:04:00 +00:00
Yifan Yang	c571a88b59	Merge branch 'k2-fsa:master' into dev/speechllm	2025-06-18 12:29:27 +08:00
Yifan Yang	34639d5249	use padding instead of trimming (suggested by @shylockasr) use ctc compress (suggested by @shylockasr) fix revert revert revert	2025-06-18 04:25:30 +00:00
Zengwei Yao	05e3094429	refactor branch exchange in cr-ctc (#1954)	2025-06-18 04:25:15 +00:00
Wei Kang	06539d2b9d	Add Zipvoice (#1964) * Add ZipVoice - a flow-matching based zero-shot TTS model.	2025-06-17 20:17:12 +08:00
yfyeung	7c30dd570b	restrict deepspeed >=0.16.9	2025-05-28 03:42:03 +00:00
Zengwei Yao	ffb7d05635	refactor branch exchange in cr-ctc (#1954)	2025-05-27 12:09:59 +08:00
yfyeung	11ccaa3ab8	add requirements.txt	2025-05-26 04:11:28 +00:00
Yifan Yang	d1a535dc76	Merge branch 'k2-fsa:master' into dev/speechllm	2025-05-24 13:13:42 +08:00
Mahsa Yarmohammadi	021e1a8846	Add acknowledgment to README (#1950)	2025-05-22 22:06:35 +08:00
Tianxiang Zhao	30e7ea4b5a	Fix a bug in finetune.py --use-mux (#1949)	2025-05-22 12:05:01 +08:00
Fangjun Kuang	fd8f8780fa	Fix logging torch.dtype. (#1947)	2025-05-21 12:04:57 +08:00
Yifan Yang	24b6f42340	fix typos in docs fix typo in RESULTS.md Update RESULTS.md	2025-05-13 14:51:17 +08:00
yifanyeung	62dfe56cbe	restore checkpoint save after validation	2025-05-13 06:14:59 +00:00
yfyeung	06667e1f6d	add batch shave mechanism fix fix	2025-05-12 17:39:15 +00:00
Yifan Yang	ea20ac208d	Merge branch 'k2-fsa:master' into dev/speechllm	2025-05-12 20:31:41 +08:00
Yifan Yang	e79833aad2	ensure SwooshL/SwooshR output dtype matches input dtype (#1940)	2025-05-12 19:28:48 +08:00
Yifan Yang	c709ce433d	Merge branch 'k2-fsa:master' into dev/speechllm	2025-05-12 14:38:13 +08:00
yfyeung	2793ccdf56	remove checkpoint save after validation	2025-05-12 06:36:20 +00:00
Yifan Yang	4627969ccd	fix bug: undefined name 'partial' (#1941)	2025-05-12 14:19:53 +08:00
yfyeung	c078772e59	skip OOM	2025-05-11 17:23:19 +00:00
yfyeung	9939c2b72d	remove duplicated torch autocast	2025-05-11 17:03:44 +00:00
Yifan Yang	5fbeed9f96	fix SwooshR and SwooshL	2025-05-12 00:48:42 +08:00
yfyeung	cd3adad46d	use quadratic-duration	2025-05-10 17:47:30 +00:00
yfyeung	c75767f600	set world_size and rank explicitly update	2025-05-10 17:47:28 +00:00
Yifan Yang	2420d0c95f	update multi_dataset.py	2025-05-10 02:13:25 +08:00
yfyeung	ec6c8f748d	fix data prepare update	2025-05-09 17:20:38 +00:00
Yifan Yang	489c42b45e	support zipformer encoder update update update update fix reformat support infer update	2025-05-08 14:44:09 +00:00
Yifan Yang	211c01bc1d	format train.py minor fix train.py	2025-05-08 04:30:02 +00:00
Yifan Yang	23b5a7ce3e	format multi_dataset.py	2025-05-08 04:28:57 +00:00
Yifan Yang	9c8c4314de	init zipformer_llm_zh	2025-05-07 12:18:41 +00:00


1248 Commits