icefall

Author	SHA1	Message	Date
root	b26d3fa596	add logging	2024-06-14 12:08:50 +08:00
root	4ebccebcc0	removing debug log	2024-06-14 12:08:50 +08:00
root	271536248f	fix decoding issue and padding to longest	2024-06-14 12:08:50 +08:00
root	eb2c255e1e	remove position ids	2024-06-14 12:08:50 +08:00
root	639feab4df	update dataset with aishell 2	2024-06-14 12:08:50 +08:00
root	8afb0d647f	fix template	2024-06-14 12:08:50 +08:00
Yuekai Zhang	16f18080be	update prompt for decoding	2024-06-14 12:08:50 +08:00
Yuekai Zhang	40e4ac480c	change prompt	2024-06-14 12:08:50 +08:00
Yuekai Zhang	68b99f456f	fix debug	2024-06-14 12:08:50 +08:00
Yuekai Zhang	8bbd06112a	add decode log	2024-06-14 12:08:50 +08:00
root	412e926941	fix down sample method	2024-06-14 12:08:50 +08:00
root	796663066f	mask unrelated labels	2024-06-14 12:08:50 +08:00
Yuekai Zhang	3ac27d5ad4	fix requirements	2024-06-14 12:08:26 +08:00
Yuekai Zhang	09ec0d6553	add requirements.txt	2024-06-14 12:08:26 +08:00
Yuekai Zhang	19b5b86f9b	fix decoding issues	2024-06-14 12:08:26 +08:00
Yuekai Zhang	3dbbc29429	add decode file	2024-06-14 12:08:26 +08:00
Yuekai Zhang	b5a906cbbd	fix bugs	2024-06-14 12:08:26 +08:00
Yuekai Zhang	e495c9d732	add whisper llm	2024-06-14 12:08:26 +08:00
Triplecq	3b40d9bbb1	Zipformer recipe for ReazonSpeech (#1611) * Add first cut at ReazonSpeech recipe This recipe is mostly based on egs/csj, but tweaked to the point that can be run with ReazonSpeech corpus. Signed-off-by: Fujimoto Seiji <fujimoto@ceptord.net> --------- Signed-off-by: Fujimoto Seiji <fujimoto@ceptord.net> Co-authored-by: Fujimoto Seiji <fujimoto@ceptord.net> Co-authored-by: Chen <qc@KDM00.cm.cluster> Co-authored-by: root <root@KDA01.cm.cluster>	2024-06-13 14:19:03 +08:00
Yuekai Zhang	d5be739639	add distill whisper results (#1648)	2024-06-13 00:20:04 +08:00
Fangjun Kuang	13f55d0735	Add merge_tokens for ctc forced alignment (#1649)	2024-06-12 17:45:13 +08:00
Fangjun Kuang	ec0389a3c1	Add doc about FST-based CTC forced alignment. (#1482)	2024-06-12 17:36:57 +08:00
Daniel Povey	4d5c1f2e60	Remove inf from stored stats (#1647)	2024-06-10 22:41:54 +08:00
Fangjun Kuang	130a18cc10	support torch 2.3.1 in docker (#1646)	2024-06-06 22:27:29 +08:00
Fangjun Kuang	b88062292b	Typo fixes (#1643)	2024-06-03 16:49:21 +08:00
zr_jin	42a97f6d7b	Update env.py (#1635)	2024-05-22 22:29:38 +08:00
zr_jin	1adf1e441d	Removed unused ``k2`` dependencies from the AT recipe (#1633)	2024-05-21 18:22:19 +08:00
Zengwei Yao	0df406c5da	Initialize BiasNorm bias with small random values (#1630)	2024-05-20 22:32:02 +08:00
zr_jin	68980c5d0a	Fix an error occured during mmi preparation (#1626) * init commit * updated	2024-05-17 19:45:15 +08:00
zr_jin	9d570870cf	Update asr_datamodule.py (#1619)	2024-05-07 21:37:55 +08:00
Yifan Yang	4e97b19b63	Remove duplicate logging initialization logic in utils.py (#1617)	2024-05-06 13:00:27 +08:00
Zengwei Yao	c08fe48603	add force=True to logging.basicConfig (#1613)	2024-05-04 11:42:23 +08:00
Yuekai Zhang	6d7c1d13a5	update speechio whisper ft results (#1605) * update speechio whisper ft results	2024-04-30 11:49:20 +08:00
Wei Kang	b49351fc39	Update README.md for conformer-ctc (#1609)	2024-04-28 09:56:13 +08:00
Dongji Gao	9a17f4ce41	add OTC related scripts using phone as units instead of BPEs (#1602) * add otc related scripts using phone instead of bpe	2024-04-26 00:55:44 +08:00
zzasdf	25cabb7663	fix error in padding computing (#1607)	2024-04-25 22:40:07 +08:00
Xiaoyu Yang	df36f93bd8	add small-scaled model for audio tagging (#1604)	2024-04-24 17:00:42 +08:00
Yifan Yang	368b7d10a7	clear log handlers before setup (#1603)	2024-04-24 15:31:25 +09:00
zr_jin	9f8f0bceb5	Update prepare.sh (#1601)	2024-04-20 23:02:02 +09:00
Yifan Yang	ed6bc200e3	Update train.py (#1590)	2024-04-11 19:35:25 +08:00
Fangjun Kuang	ba5b2e854b	Return probs in audio tagging onnx models (#1586)	2024-04-10 09:03:30 +08:00
Fangjun Kuang	fa5d861af0	Add CI test for the AudioSet recipe. (#1585)	2024-04-09 17:45:00 +08:00
yh646492956	f5d7818733	fix run.sh script in wenetspeech KWS (#1584) Co-authored-by: Hao You <13182720519@sina.cn>	2024-04-09 15:16:12 +08:00
Xiaoyu Yang	1732dafe24	Add zipformer recipe for audio tagging (#1421)	2024-04-09 12:06:14 +08:00
zr_jin	f2e36ec414	Zipformer recipe for CommonVoice (#1546) * added scripts for char-based lang prep training scripts * added `Zipformer` recipe for commonvoice --------- Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>	2024-04-09 11:37:08 +08:00
Yifan Yang	87843e9382	k2SSL: a Faster and Better Framework for Self-Supervised Speech Representation Learning (#1500) * Add k2SSL * fix flake8 * fix for black * fix for black * fix for black * Update ssl_datamodule.py * Fix bugs in HubertDataset * update comments * add librilight * add checkpoint convert script * format --------- Co-authored-by: yifanyeung <yifanyeung@yifanyeung.local> Co-authored-by: zzasdf <15218404468@163.com>	2024-04-04 23:29:16 +08:00
Fangjun Kuang	c45e9fecfb	support torch 2.2.2 in docker images (#1578)	2024-04-03 11:26:24 +08:00
Wei Kang	9369c2bef9	Add comments to prepare.sh in aidatatang (#1575)	2024-04-02 16:08:09 +08:00
Dadoou	6cbddaa8e3	Add base choice to model_name argument for whisper model. (#1573) Co-authored-by: dadoou <dadoou@yandex.com>	2024-04-02 09:47:38 +08:00
Wei Kang	42de459110	Fix decoding finetune model (#1568)	2024-03-26 10:38:21 +08:00

1114 Commits