icefall

Archived

Author	SHA1	Message	Date
Yifan Yang	482c24eab0	Merge pull request #2 from zzasdf/k2ssl-util checkpoint convert script	2024-03-19 17:44:51 +08:00
zzasdf	ac73f60f5f	format	2024-03-19 17:34:06 +08:00
zzasdf	952abee88c	add checkpoint convert script	2024-03-19 17:29:34 +08:00
Yifan Yang	ea0b6311f1	Merge branch 'k2-fsa:master' into k2ssl	2024-03-13 19:52:14 +08:00
Fangjun Kuang	15bd9a841e	add CI for ljspeech (#1548 )	2024-03-13 17:39:01 +08:00
Fangjun Kuang	d406b41cbd	Doc: Add page for installing piper-phonemize (#1547 )	2024-03-13 11:01:18 +08:00
zr_jin	c3f6f28116	Zipformer recipe for Cantonese dataset MDCC (#1537 ) * init commit * Create README.md * handle code switching cases * misc. fixes * added manifest statistics * init commit for the zipformer recipe * added scripts for exporting model * added RESULTS.md * added scripts for streaming related stuff * doc str fixed	2024-03-13 10:01:28 +08:00
yifanyeung	9321f8ab7a	add librilight	2024-03-13 00:14:20 +08:00
Fangjun Kuang	81f518ea7c	Support different tts model types. (#1541 )	2024-03-12 22:29:21 +08:00
BannerWang	959906e9dc	Correct alimeeting download link (#1544 ) Co-authored-by: BannerWang <banner.wang@upblocks.io>	2024-03-12 12:44:09 +08:00
jimmy1984xu	e472fa6840	fix CutMix init parameter (#1543 ) Co-authored-by: jimmyxu <jimmyxu@upblocks.io>	2024-03-11 18:37:26 +08:00
Yifan Yang	660f647886	Merge branch 'k2-fsa:master' into k2ssl	2024-03-10 13:10:36 +08:00
Fangjun Kuang	60986c3ac1	Fix default value for --context-size in icefall. (#1538 )	2024-03-08 20:47:13 +08:00
zr_jin	ae61bd4090	Minor fixes for the `commonvoice` recipe (#1534 ) * init commit * fix for issue https://github.com/k2-fsa/icefall/issues/1531 * minor fixes	2024-03-08 11:01:11 +08:00
Yuekai Zhang	5df24c1685	Whisper large fine-tuning on wenetspeech, mutli-hans-zh (#1483 ) * add whisper fbank for wenetspeech * add whisper fbank for other dataset * add str to bool * add decode for wenetspeech * add requirments.txt * add original model decode with 30s * test feature extractor speed * add aishell2 feat * change compute feature batch * fix overwrite * fix executor * regression * add kaldifeatwhisper fbank * fix io issue * parallel jobs * use multi machines * add wenetspeech fine-tune scripts * add monkey patch codes * remove useless file * fix subsampling factor * fix too long audios * add remove long short * fix whisper version to support multi batch beam * decode all wav files * remove utterance more than 30s in test_net * only test net * using soft links * add kespeech whisper feats * fix index error * add manifests for whisper * change to licomchunky writer * add missing option * decrease cpu usage * add speed perturb for kespeech * fix kespeech speed perturb * add dataset * load checkpoint from specific path * add speechio * add speechio results --------- Co-authored-by: zr_jin <peter.jin.cn@gmail.com>	2024-03-07 19:04:27 +08:00
zr_jin	cdb3fb5675	add text norm script for pl (#1532 )	2024-03-07 18:47:29 +08:00
zr_jin	335a9962de	Fixed formatting issue of PR #1528 (#1530 )	2024-03-06 08:43:45 +08:00
Rezakh20	ff430b465f	Add num_features to train.py for training WSASR (#1528 )	2024-03-05 16:40:30 +08:00
zr_jin	242002e0bd	Strengthened style constraints (#1527 )	2024-03-04 23:28:04 +08:00
yifanyeung	bed950dbcb	update comments	2024-03-01 20:14:23 +08:00
Fangjun Kuang	29b195a42e	Update export-onnx.py for vits to support sherpa-onnx. (#1524 )	2024-03-01 19:53:58 +08:00
zr_jin	58610b1bf6	Provides `README.md` for TTS recipes (#1491 ) * Update README.md	2024-02-29 17:31:28 +08:00
Fangjun Kuang	2f102eb989	Add CUDA docker image for torch 2.2.1 (#1521 )	2024-02-29 11:41:18 +08:00
Xiaoyu Yang	7e2b561bbf	Add recipe for fine-tuning Zipformer with adapter (#1512 )	2024-02-29 10:57:38 +08:00
Zengwei Yao	d89f4ea149	Use piper_phonemize as text tokenizer in ljspeech recipe (#1511 ) * use piper_phonemize as text tokenizer in ljspeech recipe * modify usage of tokenizer in vits/train.py * update docs	2024-02-29 10:13:22 +08:00
yifanyeung	99044e1c2b	Fix bugs in HubertDataset	2024-02-27 22:13:39 +08:00
Yifan Yang	8515d92f47	Update ssl_datamodule.py	2024-02-27 10:54:54 +08:00
Yifan Yang	bb266b7ef8	Merge branch 'k2-fsa:master' into k2ssl	2024-02-27 10:48:48 +08:00
Fangjun Kuang	291d06056c	Support torch 2.2.1 for cpu docker. (#1516 )	2024-02-23 14:24:13 +08:00
Yifan Yang	f2f102dae7	Merge branch 'k2-fsa:master' into k2ssl	2024-02-22 17:45:42 +08:00
Xiaoyu Yang	2483b8b4da	Zipformer recipe for SPGISpeech (#1449 )	2024-02-22 15:53:19 +08:00
Wei Kang	819bb45539	Add pypinyin to requirements (#1515 )	2024-02-22 15:50:11 +08:00
Wei Kang	aac7df064a	Recipes for open vocabulary keyword spotting (#1428 ) * English recipe on gigaspeech; Chinese recipe on wenetspeech	2024-02-22 15:31:20 +08:00
Xiaoyu Yang	13daf73468	docs for finetune zipformer (#1509 )	2024-02-21 18:06:27 +08:00
Wei Kang	c19b414778	Update docker (adding pypinyin (#1513 ) Update docker (adding pypinyin)	2024-02-21 08:04:16 +08:00
zr_jin	027302c902	minor fix for param. names (#1495 )	2024-02-20 14:38:51 +08:00
Karel Vesely	e59fa38e86	docs: minor fixes of LM rescoring texts (#1498 )	2024-02-20 10:40:15 +08:00
Zengwei Yao	b3e2044068	minor fix of vits/tokenizer.py (#1504 ) * minor fix of vits/tokenizer.py	2024-02-19 19:33:32 +08:00
zr_jin	db4d66c0e3	Fixed softlink for `ljspeech` recipe (#1503 )	2024-02-19 16:13:09 +08:00
Fangjun Kuang	7eb360d0d5	Fix cpu docker images for torch 2.2.0 (#1502 )	2024-02-18 20:32:40 +08:00
Fangjun Kuang	17688476e5	Provider docker images for torch 2.2.0 (#1501 )	2024-02-18 14:56:04 +08:00
yifanyeung	911bfacffd	fix for black	2024-02-18 13:24:02 +08:00
yifanyeung	c0a5601c3d	fix for black	2024-02-18 13:15:56 +08:00
yifanyeung	809bdb07f0	fix for black	2024-02-18 12:44:44 +08:00
yifanyeung	b070d04ae8	fix flake8	2024-02-18 12:36:47 +08:00
Fangjun Kuang	06b356a610	Update cpu docker images to support torch 2.2.0 (#1499 )	2024-02-18 12:05:38 +08:00
yifanyeung	a2bf39a531	Add k2SSL	2024-02-18 11:44:26 +08:00
safarisadegh	d9ae8c02a0	Update README.md (#1497 )	2024-02-09 15:05:01 +08:00
Wei Kang	711d6bc462	Refactor prepare.sh in librispeech (#1493 ) * Refactor prepare.sh in librispeech, break it into three parts, prepare.sh (basic, minimal requirement for transducer), prepare_lm.sh (ngram & nnlm staff), prepare_mmi.sh (for MMI training).	2024-02-09 10:44:19 +08:00
Tiance Wang	4ed88d9484	Update shared (#1487 ) There should be one more ../	2024-02-07 10:16:02 +08:00


1064 Commits