icefall

Author	SHA1	Message	Date
Yuekai Zhang	341c29e6e2	fix whisper version to support multi batch beam	2024-01-31 14:02:39 +08:00
Yuekai Zhang	c19891ee8e	add remove long short	2024-01-31 14:02:39 +08:00
Yuekai Zhang	bb07b65e45	add remove long short	2024-01-31 14:02:39 +08:00
Yuekai Zhang	1600f7db95	fix too long audios	2024-01-31 14:02:39 +08:00
Yuekai Zhang	b76cd65abf	fix subsampling factor	2024-01-31 14:02:39 +08:00
Yuekai Zhang	ad796d929d	remove useless file	2024-01-31 14:02:39 +08:00
Yuekai Zhang	e49534f2dd	add monkey patch codes	2024-01-31 14:02:39 +08:00
Yuekai Zhang	e1a55b945b	add wenetspeech fine-tune scripts	2024-01-31 14:02:39 +08:00
Yuekai Zhang	baa7c5fb8d	use multi machines	2024-01-31 14:02:39 +08:00
Yuekai Zhang	cf85019290	parallel jobs	2024-01-31 14:02:39 +08:00
Yuekai Zhang	df54121c41	fix io issue	2024-01-31 14:02:39 +08:00
Yuekai Zhang	af29455c3d	add kaldifeatwhisper fbank	2024-01-31 14:02:39 +08:00
Yuekai Zhang	08db3051ad	regression	2024-01-31 14:02:39 +08:00
Yuekai Zhang	f66b266aa4	fix executor	2024-01-31 14:02:39 +08:00
Yuekai Zhang	e46e9b77ee	fix overwrite	2024-01-31 14:02:39 +08:00
Yuekai Zhang	fd77c5758c	change compute feature batch	2024-01-31 14:02:39 +08:00
Yuekai Zhang	f4cf9fb2d3	add aishell2 feat	2024-01-31 14:02:39 +08:00
Yuekai Zhang	aa7b17e410	test feature extractor speed	2024-01-31 14:02:39 +08:00
Yuekai Zhang	d1b010463c	add original model decode with 30s	2024-01-31 14:02:39 +08:00
Yuekai Zhang	38f5f45c67	add requirments.txt	2024-01-31 14:02:39 +08:00
Yuekai Zhang	72c9d01724	add decode for wenetspeech	2024-01-31 14:02:39 +08:00
Yuekai Zhang	046e071ca3	add str to bool	2024-01-31 14:02:39 +08:00
Yuekai Zhang	315175a362	add whisper fbank for other dataset	2024-01-31 14:02:39 +08:00
Yuekai Zhang	e43c4da91d	add whisper fbank for wenetspeech	2024-01-31 14:02:39 +08:00
zr_jin	37b975cac9	fixed a CI test for `wenetspeech` (#1476 ) * Comply to issue #1149 https://github.com/k2-fsa/icefall/issues/1149	2024-01-27 06:41:56 +08:00
Yuekai Zhang	1c30847947	Whisper Fine-tuning Recipe on Aishell1 (#1466 ) * add decode seamlessm4t * add requirements * add decoding with avg model * add token files * add custom tokenizer * support deepspeed to finetune large model * support large-v3 * add model saving * using monkey patch to replace models * add manifest dir option	2024-01-27 00:32:30 +08:00
Fangjun Kuang	8d39f9508b	Fix torchscript export to use tokens.txt instead of lang_dir (#1475 )	2024-01-26 19:18:33 +08:00
Zengwei Yao	c401a2646b	minor fix of zipformer/optim.py (#1474 )	2024-01-26 15:50:11 +08:00
zr_jin	9c494a3329	typos fixed (#1472 )	2024-01-25 18:41:43 +08:00
Yifan Yang	559ed150bb	Fix typo (#1471 )	2024-01-23 22:51:09 +08:00
zr_jin	ebe97a07b0	Reworked README.md (#1470 ) * Rework README.md Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>	2024-01-23 16:26:24 +08:00
Yifan Yang	5dfc3ed7f9	Fix buffer size of DynamicBucketingSampler (#1468 ) * Fix buffer size * Fix for flake8 Co-authored-by: yifanyeung <yifanyeung@yifanyeung.local>	2024-01-21 02:10:42 +08:00
zr_jin	7bdde9174c	A Zipformer recipe with Byte-level BPE for Aishell-1 (#1464 ) * init commit * Update train.py * Update decode.py * Update RESULTS.md * added `vocab_size` * removed unused softlinks * added scripts for testing pretrained models * set `bpe_model` as required * re-org the bbpe recipe for aishell	2024-01-16 21:08:35 +08:00
Fangjun Kuang	398401ed27	Update kaldifeat installation doc (#1460 )	2024-01-14 14:38:41 +08:00
Xiaoyu Yang	e2fcb42f5f	fix typo (#1455 )	2024-01-09 15:41:37 +08:00
zr_jin	5445ea6df6	Use shuffled LibriSpeech cuts instead (#1450 ) * use shuffled LibriSpeech cuts instead * leave the old code in comments for reference	2024-01-08 15:09:21 +08:00
zr_jin	b9b56eb879	Minor fixes to the VCTK data prep scripts (#1441 ) * Update prepare.sh	2024-01-08 14:28:07 +08:00
Karel Vesely	716b82cc3a	streaming_decode.py, relax the audio range from [-1,+1] to [-10,+10] (#1448 ) - some AudioTransform classes produce audio signals out of range [-1,+1] - Resample produced 1.0079 - The range [-10,+10] was chosen to still be able to reliably distinguish from the [-32k,+32k] signal... - this is related to : https://github.com/lhotse-speech/lhotse/issues/1254	2024-01-05 10:21:27 +08:00
Fangjun Kuang	8136ad775b	Use high_freq -400 in computing fbank features. (#1447 ) See also https://github.com/k2-fsa/sherpa-onnx/issues/514	2024-01-04 13:59:32 +08:00
zr_jin	f42258caf8	Update compute_fbank_commonvoice_splits.py (#1437 )	2023-12-30 13:03:26 +08:00
Fangjun Kuang	140e6381ad	Refactor CI tests for librispeech (#1436 )	2023-12-27 13:21:14 +08:00
Fangjun Kuang	db52fe2349	Refactor CI test for aishell (#1435 )	2023-12-26 20:29:43 +08:00
Fangjun Kuang	835a92eba5	Add doc about how to use the CPU-only docker images (#1432 )	2023-12-25 20:23:56 +08:00
Ali Haznedaroğlu	ddd7131317	Update TTS export-onnx.py scripts for handling variable token counts (#1430 )	2023-12-25 19:44:07 +08:00
Fangjun Kuang	c855a58cfd	Generate the dependency matrix by code for GitHub Actions (#1431 )	2023-12-25 19:41:09 +08:00
Fangjun Kuang	e5bb1ae86c	Use the CPU docker in CI to simplify the test code (#1427 )	2023-12-24 13:40:33 +08:00
Fangjun Kuang	79a42148db	Add CI test to cover zipformer/train.py (#1424 )	2023-12-23 00:38:36 +08:00
TianHao Zhang	702d4f5914	Update prepare.sh (#1422 ) fix the bug in line 251: 1、 del the additional blank 2、correct the spell error of "new_vocab_size"	2023-12-21 14:42:33 +08:00
zr_jin	10a234709c	bugs fixed (#1416 )	2023-12-14 11:26:37 +08:00
Fangjun Kuang	f85f0252a9	Add greedy search for streaming zipformer CTC. (#1415 )	2023-12-13 17:34:12 +08:00


1034 Commits