icefall

Author	SHA1	Message	Date
Yuekai Zhang	84e4af93d7	add whisper fine-tuning results	2024-01-17 16:17:32 +08:00
Yuekai Zhang	557b35cefc	clean codes	2024-01-15 20:40:44 +08:00
Yuekai Zhang	eea46458c5	revert asr data module	2024-01-15 19:59:48 +08:00
Yuekai Zhang	e883bb60d4	remove seamless for next PR	2024-01-15 19:51:43 +08:00
Yuekai Zhang	ac53222054	add model saving	2024-01-15 19:51:43 +08:00
Yuekai Zhang	2ce09809cd	support large-v3	2024-01-15 19:51:41 +08:00
Yuekai Zhang	fa7ad4dc72	update deepspeed model loading	2024-01-15 19:50:57 +08:00
Yuekai Zhang	b6418acda2	support deepspeed to finetune large model	2024-01-15 19:50:57 +08:00
Yuekai Zhang	92895f774f	clean up codes	2024-01-15 19:50:57 +08:00
Yuekai Zhang	98d11abedb	remove padding to 30s, compute validation loss once	2024-01-15 19:50:57 +08:00
Yuekai Zhang	07cefa82a7	change scaleadam to adamw	2024-01-15 19:50:55 +08:00
Yuekai Zhang	8b832f168d	update lhotse version	2024-01-15 19:49:50 +08:00
Yuekai Zhang	5bf3a9cfe0	using audio with any length	2024-01-15 19:49:50 +08:00
Yuekai Zhang	6c2cd5b4c3	support whisper ft	2024-01-15 19:49:26 +08:00
Yuekai Zhang	bb1c4466e3	rename train, train2, add support to fine-tune embedding table	2024-01-15 19:49:26 +08:00
Yuekai Zhang	d926585b10	fix loading	2024-01-15 19:49:26 +08:00
Yuekai Zhang	2a288fb9bf	add custom tokenizer	2024-01-15 19:49:26 +08:00
Yuekai Zhang	22ee287312	add token files	2024-01-15 19:49:26 +08:00
Yuekai Zhang	7e387dd54b	change vocab table	2024-01-15 19:49:26 +08:00
Yuekai Zhang	72e9a436b8	fix typo	2024-01-15 19:49:26 +08:00
Yuekai Zhang	cc6432443d	add decoding with avg model	2024-01-15 19:49:26 +08:00
Yuekai Zhang	5f399dc780	load checkpoint to decode	2024-01-15 19:49:26 +08:00
Yuekai Zhang	e81545714a	update decoding from checkpoint	2024-01-15 19:49:26 +08:00
Yuekai Zhang	0d6d8f9473	update fine-tuning lr	2024-01-15 19:49:26 +08:00
Yuekai Zhang	cbc3852876	add fairseq2 require	2024-01-15 19:49:26 +08:00
Yuekai Zhang	3a7ad277ad	add requirements	2024-01-15 19:49:26 +08:00
Yuekai Zhang	363c3f1f82	update finetuning codes	2024-01-15 19:49:26 +08:00
Yuekai Zhang	f99f4d7c92	add decode seamlessm4t	2024-01-15 19:49:26 +08:00
Fangjun Kuang	398401ed27	Update kaldifeat installation doc (#1460)	2024-01-14 14:38:41 +08:00
Xiaoyu Yang	e2fcb42f5f	fix typo (#1455)	2024-01-09 15:41:37 +08:00
zr_jin	5445ea6df6	Use shuffled LibriSpeech cuts instead (#1450) * use shuffled LibriSpeech cuts instead * leave the old code in comments for reference	2024-01-08 15:09:21 +08:00
zr_jin	b9b56eb879	Minor fixes to the VCTK data prep scripts (#1441) * Update prepare.sh	2024-01-08 14:28:07 +08:00
Karel Vesely	716b82cc3a	streaming_decode.py, relax the audio range from [-1,+1] to [-10,+10] (#1448) - some AudioTransform classes produce audio signals out of range [-1,+1] - Resample produced 1.0079 - The range [-10,+10] was chosen to still be able to reliably distinguish from the [-32k,+32k] signal... - this is related to : https://github.com/lhotse-speech/lhotse/issues/1254	2024-01-05 10:21:27 +08:00
Fangjun Kuang	8136ad775b	Use high_freq -400 in computing fbank features. (#1447) See also https://github.com/k2-fsa/sherpa-onnx/issues/514	2024-01-04 13:59:32 +08:00
zr_jin	f42258caf8	Update compute_fbank_commonvoice_splits.py (#1437)	2023-12-30 13:03:26 +08:00
Fangjun Kuang	140e6381ad	Refactor CI tests for librispeech (#1436)	2023-12-27 13:21:14 +08:00
Fangjun Kuang	db52fe2349	Refactor CI test for aishell (#1435)	2023-12-26 20:29:43 +08:00
Fangjun Kuang	835a92eba5	Add doc about how to use the CPU-only docker images (#1432)	2023-12-25 20:23:56 +08:00
Ali Haznedaroğlu	ddd7131317	Update TTS export-onnx.py scripts for handling variable token counts (#1430)	2023-12-25 19:44:07 +08:00
Fangjun Kuang	c855a58cfd	Generate the dependency matrix by code for GitHub Actions (#1431)	2023-12-25 19:41:09 +08:00
Fangjun Kuang	e5bb1ae86c	Use the CPU docker in CI to simplify the test code (#1427)	2023-12-24 13:40:33 +08:00
Fangjun Kuang	79a42148db	Add CI test to cover zipformer/train.py (#1424)	2023-12-23 00:38:36 +08:00
TianHao Zhang	702d4f5914	Update prepare.sh (#1422) fix the bug in line 251: 1、 del the additional blank 2、correct the spell error of "new_vocab_size"	2023-12-21 14:42:33 +08:00
zr_jin	10a234709c	bugs fixed (#1416)	2023-12-14 11:26:37 +08:00
Fangjun Kuang	f85f0252a9	Add greedy search for streaming zipformer CTC. (#1415)	2023-12-13 17:34:12 +08:00
zr_jin	d0da509055	Support ONNX export for Streaming CTC Encoder (#1413) * Create export-onnx-streaming-ctc.py * doc_str updated Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> --------- Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>	2023-12-13 10:33:28 +08:00
Fangjun Kuang	9e9fe7954d	Upload gigaspeech zipformer models in CI (#1412)	2023-12-12 18:57:04 +08:00
Fangjun Kuang	20a82c9abf	first commit (#1411)	2023-12-12 18:13:26 +08:00
Fangjun Kuang	b0f70c9d04	Fix torch.jit.script() export for pruned_transducer_stateless2 (#1410)	2023-12-10 11:38:39 +08:00
zr_jin	df56aff31e	minor fixes to the vits onnx exportation scripts (#1408)	2023-12-08 21:11:31 +08:00

1029 Commits