icefall

mirror of https://github.com/k2-fsa/icefall.git synced 2025-12-11 06:55:27 +00:00

Author	SHA1	Message	Date
marcoyang1998	2f4eb18466	merge from master	2023-09-15 10:28:11 +08:00
marcoyang1998	1bd6be03c1	minor updates	2023-09-15 09:56:42 +08:00
marcoyang1998	cb85d4c337	remove unused scripts	2023-09-15 09:55:34 +08:00
zr_jin	565d2c2f5b	Minor fixes to the libricss recipe (#1256)	2023-09-15 02:37:53 +08:00
marcoyang1998	66ac3a4ecc	removed un-used files	2023-09-14 18:38:44 +08:00
marcoyang1998	84ff2ab67c	add text normalization for librispeech test sets	2023-09-14 18:36:09 +08:00
marcoyang1998	f9ef9f38eb	support computing CER, writing character level transcript	2023-09-14 18:31:18 +08:00
docterstrange	fba1710622	modify tal_csasr recipe (#1252) Co-authored-by: zss11 <zss11@d3-hpc-sjtu-test-001.cm.cluster>	2023-09-14 09:58:28 +08:00
zr_jin	7cc2dae940	Fixes to incorporate with the latest Lhotse release (#1249)	2023-09-13 12:39:49 +08:00
zr_jin	0f1bc6f8af	Multi_zh-Hans Recipe (#1238) * Init commit for recipes trained on multiple zh datasets. * fbank extraction for thchs30 * added support for aishell1 * added support for aishell-2 * fixes * fixes * fixes * added support for stcmds and primewords * fixes * added support for magicdata script for fbank computation not done yet * added script for magicdata fbank computation * file permission fixed * updated for the wenetspeech recipe * updated * Update preprocess_kespeech.py * updated * updated * updated * updated * file permission fixed * updated paths * fixes * added support for kespeech dev/test set fbank computation * fixes for file permission * refined support for KeSpeech * added scripts for BPE model training * updated * init commit for the multi_zh-cn zipformer recipe * disable speed perturbation by default * updated * updated * added necessary files for the zipformer recipe * removed redundant wenetspeech M and S sets * updates for multi dataset decoding * refined * formatting issues fixed * updated * minor fixes * this commit finalize the recipe (hopefully) * fixed formatting issues * minor fixes * updated * using soft links to reduce redundancy * minor updates * using soft links to reduce redundancy * minor updates * minor updates * using soft links to reduce redundancy * minor updates * Update README.md * minor updates * Update egs/multi_zh-hans/ASR/local/compute_fbank_magicdata.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * Update egs/multi_zh-hans/ASR/local/compute_fbank_magicdata.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * Update egs/multi_zh-hans/ASR/local/compute_fbank_stcmds.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * Update egs/multi_zh-hans/ASR/local/compute_fbank_stcmds.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * Update egs/multi_zh-hans/ASR/local/compute_fbank_primewords.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * Update egs/multi_zh-hans/ASR/local/compute_fbank_primewords.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * minor updates * minor fixes * fixed a formatting issue * Update preprocess_kespeech.py * Update prepare.sh * Update egs/multi_zh-hans/ASR/local/compute_fbank_kespeech_splits.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * Update egs/multi_zh-hans/ASR/local/preprocess_kespeech.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * removed redundant files * symlinks added * minor updates * added CI tests for `multi_zh-hans` * minor fixes * Update run-multi-zh_hans-zipformer.sh * Update run-multi-zh_hans-zipformer.sh * Update run-multi-zh_hans-zipformer.sh * Update run-multi-zh_hans-zipformer.sh * Update run-multi-zh_hans-zipformer.sh * Update run-multi-zh_hans-zipformer.sh * Update run-multi-zh_hans-zipformer.sh --------- Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>	2023-09-13 11:57:05 +08:00
zr_jin	3199058194	enable `sclite_mode` for swbd scoring (#1239)	2023-09-09 21:25:26 +08:00
marcoyang1998	81af525de4	update the biasing lists	2023-09-08 10:15:21 +08:00
marcoyang1998	bbf1577818	add long audio transcription scripts	2023-09-08 10:02:41 +08:00
marcoyang1998	07e27348dd	more updates	2023-09-08 10:01:48 +08:00
marcoyang1998	013cafdd6d	updates	2023-09-08 10:00:00 +08:00
marcoyang1998	522273f97e	change the text normalization for upper_case_no_punc	2023-09-08 09:57:24 +08:00
marcoyang1998	77890a6115	add context biasing at different levels	2023-09-08 09:56:45 +08:00
marcoyang1998	d4c5a1c157	updates	2023-09-08 09:55:41 +08:00
zr_jin	49a4b67288	fixed a CI test issue related to python version (#1243)	2023-09-07 19:48:46 +08:00
zr_jin	c912bd65d0	Update run-gigaspeech-pruned-transducer-stateless2-2022-05-12.sh (#1242)	2023-09-07 18:48:27 +08:00
zr_jin	d50a9ea030	doc str fixes (#1241)	2023-09-07 16:34:53 +08:00
zr_jin	9ef8145fa3	minor fixes (#1240)	2023-09-04 17:56:05 +08:00
Desh Raj	8fcadb68a7	Missing definitions in scaling.py added (#1232)	2023-08-31 10:31:05 +08:00
marcoyang1998	3a1ce5963b	Minor fix for documentation (#1229)	2023-08-29 16:39:48 +08:00
marcoyang1998	cad01bfcb6	add subformer model with style embeddings	2023-08-29 16:04:51 +08:00
marcoyang1998	16e8907805	update text normalization for librispeech test sets	2023-08-29 16:03:56 +08:00
Wei Kang	4d7f73ce65	Add context biasing for zipformer recipe (#1204) * Add context biasing for zipformer recipe * support context biasing in modified_beam_search_LODR * fix context graph * Minor fixes	2023-08-28 19:37:32 +08:00
marcoyang1998	80c54c05e2	support showing WERs of different books	2023-08-17 23:59:37 +08:00
marcoyang1998	f23882b9f6	also sample from distractors when using separate words in the ref text; increase the max length of substring	2023-08-17 12:11:33 +08:00
Fangjun Kuang	fc2df07841	Add icefall tutorials for dummies. (#1220)	2023-08-16 22:32:41 +08:00
marcoyang1998	8a238317a4	support using subformer as text encoder and train with style	2023-08-16 19:08:36 +08:00
marcoyang1998	73fa1651f0	minor updates to utils.py	2023-08-16 16:47:23 +08:00
marcoyang1998	2091bb5f25	add two pass decoding	2023-08-16 16:46:50 +08:00
marcoyang1998	0982db9cde	add a few args to support context list and rare words	2023-08-16 16:44:58 +08:00
marcoyang1998	4420788f66	support using context list and random substring as pre text	2023-08-16 16:44:29 +08:00
marcoyang1998	17d0918969	fix the post normalization bug, avoid multiple words	2023-08-16 09:39:42 +08:00
marcoyang1998	fdc4fcabb9	use a more aggressive sampling_weight	2023-08-16 09:38:40 +08:00
Erwan Zerhouni	9a47c08d08	Update padding modified beam search (#1217)	2023-08-14 16:10:50 +02:00
marcoyang1998	ae4d2fbfcc	initial commit	2023-08-14 09:51:20 +08:00
zr_jin	3b5645f594	doc updated (#1214)	2023-08-13 12:37:08 +08:00
Piotr Żelasko	b0e8a40c89	Speed up yesno training to finish in ~10s on CPU (#1215)	2023-08-13 09:50:59 +08:00
Fangjun Kuang	dfccadc6b6	Fix a typo in export_onnx.py for yesno (#1213)	2023-08-12 16:59:06 +08:00
zr_jin	a81396b482	Use tokens.txt to replace bpe.model (#1162)	2023-08-12 16:53:59 +08:00
Fangjun Kuang	d6b28a11a7	Add export script for the yesno recipe. (#1212)	2023-08-11 23:57:00 +08:00
zr_jin	74806b744b	disable speed perturbation by default (#1176) * disable speed perturbation by default * minor fixes * minor updates * updated bash scripts to incorporate with the `speed-perturb` arg * minor fixes 1. changed the naming scheme from `speed-perturb` to `perturb-speed` to align with the librispeech recipe >> `00256a7669/egs/librispeech/ASR/local/compute_fbank_librispeech.py (L65)` 2. changed arg type for `perturb-speed` to str2bool	2023-08-10 20:56:02 +08:00
Yifan Yang	00256a7669	Fix decode_stream.py (#1208) * Fix decode_stream.py * Update decode_stream.py	2023-08-09 09:40:58 +08:00
marcoyang1998	1ee251c8b3	Decode zipformer with external LMs (#1193) * update some documentation * support decoding with LMs in zipformer recipe * update RESULTS.md	2023-08-03 15:50:35 +08:00
Fangjun Kuang	bcabaf896c	Add doc describing how to run icefall within a docker container (#1194)	2023-08-01 12:28:34 +08:00
Fangjun Kuang	375520d419	Run the yesno recipe with docker in GitHub actions (#1191)	2023-07-28 15:43:08 +08:00
Fangjun Kuang	751bb6ff1a	Add docker image for icefall (#1189)	2023-07-28 10:34:40 +08:00

978 Commits