icefall

mirror of https://github.com/k2-fsa/icefall.git synced 2025-12-11 06:55:27 +00:00

Author	SHA1	Message	Date
Xiaoyu Yang	5952972294	Keep the custom fields in libriheavy manifest (#1719 )	2024-08-17 13:24:38 +08:00
Yifan Yang	6ac3343ce5	fix path in README.md (#1722 )	2024-08-16 20:13:02 +08:00
Karel Vesely	1730fce688	split `save_results()` -> `save_asr_output()` + `save_wer_results()` (#1712 ) - the idea is to support `--skip-scoring` argument passed to a decoding script - created for Transducer decoding (non-streaming, streaming) - it can be done also for CTC decoding... (not yet) - also added `--label` for extra label in `streaming_decode.py` - and also added `set_caching_enabled(True)`, which has no effect on librispeech, but it leads to faster runtime on DBs with long recordings (assuming `librispeech/zipformer` scripts are the example scripts for other setups)	2024-08-13 23:02:14 +08:00
Fangjun Kuang	3b257dd5ae	Add docker images for torch 2.4 (#1704 )	2024-07-25 16:46:24 +08:00
Yuekai Zhang	4af81af5a6	Update Zipformer-xl 700M Results on multi-hans-zh (#1694 ) * add blank penalty * update zipformer-xl results * fix typo	2024-07-18 21:05:59 +08:00
zzasdf	11151415f3	fix error in accum_grad (#1693 )	2024-07-17 17:47:43 +08:00
Fangjun Kuang	2e13298717	Refactor ctc greedy search. (#1691 ) Use torch.unique_consecutive() to avoid reinventing the wheel.	2024-07-15 12:01:47 +08:00
Zengwei Yao	d47c078286	add decoding method of ctc-greedy-search in zipformer recipe (#1690 )	2024-07-14 17:30:13 +08:00
Zengwei Yao	334beed2af	fix usages of returned losses after adding attention-decoder in zipformer (#1689 )	2024-07-12 16:50:58 +08:00
Ziwei Li	f6febd658e	"-" replace "_" fix writing error (#1687 )	2024-07-12 14:42:00 +08:00
Teo Wen Shen	19048e155b	Cast grad_scale in whiten to float (#1663 ) * cast grad_scale in whiten to float * fix cast in zipformer_lora	2024-07-11 15:12:30 +08:00
Yifan Yang	d65187ec52	Small fix (#1686 )	2024-07-11 14:45:35 +08:00
Zengwei Yao	785f3f0bcf	Update RESULTS.md, adding results and model links of zipformer-small/medium CTC/AED models (#1683 )	2024-07-09 20:04:47 +08:00
Yuekai Zhang	1c3d992a39	Update results using Zipformer-large on multi-hans-zh (#1679 )	2024-07-09 09:57:52 +08:00
zr_jin	2d64228efa	Update attention_decoder.py (#1681 )	2024-07-06 09:01:34 +08:00
zr_jin	325a825841	Update requirements-ci.txt (#1682 )	2024-07-06 09:01:19 +08:00
Zengwei Yao	f76afff741	Support CTC/AED option for Zipformer recipe (#1389 ) * add attention-decoder loss option for zipformer recipe * add attention-decoder-rescoring * update export.py and pretrained_ctc.py * update RESULTS.md	2024-07-05 20:19:18 +08:00
Yifan Yang	cbcac23d26	Fix typos, remove unused packages, normalize comments (#1678 )	2024-07-04 14:19:45 +08:00
Yuekai Zhang	ebbd396c2b	update multi-hans-zh whisper-qwen-7b results (#1677 ) * update qwen-7b whisper encoder results * update qwen-7b whisper encoder results * fix typo	2024-07-03 19:55:12 +08:00
Manix	eaab2c819f	Zipformer Onnx FP16 (#1671 ) Signed-off-by: manickavela29 <manickavela1998@gmail.com>	2024-06-27 16:08:24 +08:00
Fangjun Kuang	b594a3875b	Add CI for non-streaming zipformer about ksponspeech (#1667 )	2024-06-24 16:20:46 +08:00
Seung Hyun Lee	031f892796	Reformat by black non-streaming zipformer recipe for ksponspeech (#1665 )	2024-06-24 15:28:09 +08:00
Seung Hyun Lee	6f102d3470	Add non-streaming Zipformer recipe for KsponSpeech (#1664 )	2024-06-24 14:07:37 +08:00
Fangjun Kuang	3059eb4511	Fix doc URLs (#1660 )	2024-06-21 11:10:14 +08:00
Yuekai Zhang	ff2bef9e50	update multi-hans whisper-qwen-1.5b results (#1657 )	2024-06-19 11:10:31 +08:00
Seung Hyun Lee	2e05663fbb	Add prepare.sh for KsponSpeech recipe. (#1656 )	2024-06-18 16:54:39 +08:00
Fangjun Kuang	1f5c0a87b9	Add CI for ksponspeech (#1655 )	2024-06-16 19:15:09 +08:00
Seung Hyun Lee	c13c7aa30b	Add Streaming Zipformer-Transducer recipe for KsponSpeech (#1651 )	2024-06-16 16:20:44 +08:00
Yuekai Zhang	890eeec82c	Add qwen-audio style model training: using whisper + qwen2 (#1652 )	2024-06-16 12:14:44 +08:00
Triplecq	3b40d9bbb1	Zipformer recipe for ReazonSpeech (#1611 ) * Add first cut at ReazonSpeech recipe This recipe is mostly based on egs/csj, but tweaked to the point that can be run with ReazonSpeech corpus. Signed-off-by: Fujimoto Seiji <fujimoto@ceptord.net> --------- Signed-off-by: Fujimoto Seiji <fujimoto@ceptord.net> Co-authored-by: Fujimoto Seiji <fujimoto@ceptord.net> Co-authored-by: Chen <qc@KDM00.cm.cluster> Co-authored-by: root <root@KDA01.cm.cluster>	2024-06-13 14:19:03 +08:00
Yuekai Zhang	d5be739639	add distill whisper results (#1648 )	2024-06-13 00:20:04 +08:00
Fangjun Kuang	13f55d0735	Add merge_tokens for ctc forced alignment (#1649 )	2024-06-12 17:45:13 +08:00
Fangjun Kuang	ec0389a3c1	Add doc about FST-based CTC forced alignment. (#1482 )	2024-06-12 17:36:57 +08:00
Daniel Povey	4d5c1f2e60	Remove inf from stored stats (#1647 )	2024-06-10 22:41:54 +08:00
Fangjun Kuang	130a18cc10	support torch 2.3.1 in docker (#1646 )	2024-06-06 22:27:29 +08:00
Fangjun Kuang	b88062292b	Typo fixes (#1643 )	2024-06-03 16:49:21 +08:00
zr_jin	42a97f6d7b	Update env.py (#1635 )	2024-05-22 22:29:38 +08:00
zr_jin	1adf1e441d	Removed unused ``k2`` dependencies from the AT recipe (#1633 )	2024-05-21 18:22:19 +08:00
Zengwei Yao	0df406c5da	Initialize BiasNorm bias with small random values (#1630 )	2024-05-20 22:32:02 +08:00
zr_jin	68980c5d0a	Fix an error occured during mmi preparation (#1626 ) * init commit * updated	2024-05-17 19:45:15 +08:00
zr_jin	9d570870cf	Update asr_datamodule.py (#1619 )	2024-05-07 21:37:55 +08:00
Yifan Yang	4e97b19b63	Remove duplicate logging initialization logic in utils.py (#1617 )	2024-05-06 13:00:27 +08:00
Zengwei Yao	c08fe48603	add force=True to logging.basicConfig (#1613 )	2024-05-04 11:42:23 +08:00
Yuekai Zhang	6d7c1d13a5	update speechio whisper ft results (#1605 ) * update speechio whisper ft results	2024-04-30 11:49:20 +08:00
Wei Kang	b49351fc39	Update README.md for conformer-ctc (#1609 )	2024-04-28 09:56:13 +08:00
Dongji Gao	9a17f4ce41	add OTC related scripts using phone as units instead of BPEs (#1602 ) * add otc related scripts using phone instead of bpe	2024-04-26 00:55:44 +08:00
zzasdf	25cabb7663	fix error in padding computing (#1607 )	2024-04-25 22:40:07 +08:00
Xiaoyu Yang	df36f93bd8	add small-scaled model for audio tagging (#1604 )	2024-04-24 17:00:42 +08:00
Yifan Yang	368b7d10a7	clear log handlers before setup (#1603 )	2024-04-24 15:31:25 +09:00
zr_jin	9f8f0bceb5	Update prepare.sh (#1601 )	2024-04-20 23:02:02 +09:00

1225 Commits