icefall

Author	SHA1	Message	Date
marcoyang	838c24cba1	fix bug in decode	2023-10-07 16:04:53 +08:00
marcoyang	7c56d8f06b	fix a bug in sampling function	2023-10-04 00:09:27 +08:00
marcoyang	e058ba0a65	minor updates	2023-09-27 11:31:14 +08:00
marcoyang	ae3149cb7f	freeze BERT option	2023-09-21 10:24:14 +08:00
marcoyang	21cc1dfff4	fix lhotse compatibility	2023-09-21 10:22:56 +08:00
marcoyang	974c1fff08	add freeze param in utils.py	2023-09-20 19:05:12 +08:00
marcoyang1998	fdff6b3b3a	add shared	2023-09-20 14:56:38 +08:00
marcoyang1998	9485587976	add RESULTS.md, pending model link	2023-09-20 11:45:13 +08:00
marcoyang1998	203cd5cf11	add usage in decoder_bert.py	2023-09-20 11:44:36 +08:00
marcoyang1998	cda6e06a85	updates	2023-09-20 10:35:37 +08:00
marcoyang1998	93461fb77e	add documentation to different text sampling function	2023-09-20 09:57:03 +08:00
marcoyang1998	6579800720	update	2023-09-19 18:38:56 +08:00
marcoyang1998	bea1bd295f	add script for generating context list for each utterance	2023-09-19 17:44:52 +08:00
marcoyang1998	8401f26342	update some documentation for cross-attention zipformer	2023-09-19 14:53:33 +08:00
marcoyang1998	58dc0430be	remove subformer scripts	2023-09-18 17:28:50 +08:00
marcoyang1998	d411ffb4b6	update	2023-09-15 16:08:27 +08:00
marcoyang1998	a0fe6bcd0d	further clean up	2023-09-15 11:13:51 +08:00
marcoyang1998	ae2c7c73f6	remove/rename files	2023-09-15 10:54:58 +08:00
marcoyang1998	2f4eb18466	merge from master	2023-09-15 10:28:11 +08:00
marcoyang1998	1bd6be03c1	minor updates	2023-09-15 09:56:42 +08:00
marcoyang1998	cb85d4c337	remove unused scripts	2023-09-15 09:55:34 +08:00
zr_jin	565d2c2f5b	Minor fixes to the libricss recipe (#1256)	2023-09-15 02:37:53 +08:00
marcoyang1998	66ac3a4ecc	removed unused files	2023-09-14 18:38:44 +08:00
marcoyang1998	84ff2ab67c	add text normalization for librispeech test sets	2023-09-14 18:36:09 +08:00
marcoyang1998	f9ef9f38eb	support computing CER, writing character level transcript	2023-09-14 18:31:18 +08:00
docterstrange	fba1710622	modify tal_csasr recipe (#1252) Co-authored-by: zss11 <zss11@d3-hpc-sjtu-test-001.cm.cluster>	2023-09-14 09:58:28 +08:00
zr_jin	7cc2dae940	Fixes to incorporate with the latest Lhotse release (#1249)	2023-09-13 12:39:49 +08:00
zr_jin	0f1bc6f8af	Multi_zh-Hans Recipe (#1238) * Init commit for recipes trained on multiple zh datasets. * fbank extraction for thchs30 * added support for aishell1 * added support for aishell-2 * fixes * fixes * fixes * added support for stcmds and primewords * fixes * added support for magicdata script for fbank computation not done yet * added script for magicdata fbank computation * file permission fixed * updated for the wenetspeech recipe * updated * Update preprocess_kespeech.py * updated * updated * updated * updated * file permission fixed * updated paths * fixes * added support for kespeech dev/test set fbank computation * fixes for file permission * refined support for KeSpeech * added scripts for BPE model training * updated * init commit for the multi_zh-cn zipformer recipe * disable speed perturbation by default * updated * updated * added necessary files for the zipformer recipe * removed redundant wenetspeech M and S sets * updates for multi dataset decoding * refined * formatting issues fixed * updated * minor fixes * this commit finalize the recipe (hopefully) * fixed formatting issues * minor fixes * updated * using soft links to reduce redundancy * minor updates * using soft links to reduce redundancy * minor updates * minor updates * using soft links to reduce redundancy * minor updates * Update README.md * minor updates * Update egs/multi_zh-hans/ASR/local/compute_fbank_magicdata.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * Update egs/multi_zh-hans/ASR/local/compute_fbank_magicdata.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * Update egs/multi_zh-hans/ASR/local/compute_fbank_stcmds.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * Update egs/multi_zh-hans/ASR/local/compute_fbank_stcmds.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * Update egs/multi_zh-hans/ASR/local/compute_fbank_primewords.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * Update egs/multi_zh-hans/ASR/local/compute_fbank_primewords.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * minor updates * minor fixes * fixed a formatting issue * Update preprocess_kespeech.py * Update prepare.sh * Update egs/multi_zh-hans/ASR/local/compute_fbank_kespeech_splits.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * Update egs/multi_zh-hans/ASR/local/preprocess_kespeech.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * removed redundant files * symlinks added * minor updates * added CI tests for `multi_zh-hans` * minor fixes * Update run-multi-zh_hans-zipformer.sh * Update run-multi-zh_hans-zipformer.sh * Update run-multi-zh_hans-zipformer.sh * Update run-multi-zh_hans-zipformer.sh * Update run-multi-zh_hans-zipformer.sh * Update run-multi-zh_hans-zipformer.sh * Update run-multi-zh_hans-zipformer.sh --------- Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>	2023-09-13 11:57:05 +08:00
zr_jin	3199058194	enable `sclite_mode` for swbd scoring (#1239)	2023-09-09 21:25:26 +08:00
marcoyang1998	81af525de4	update the biasing lists	2023-09-08 10:15:21 +08:00
marcoyang1998	bbf1577818	add long audio transcription scripts	2023-09-08 10:02:41 +08:00
marcoyang1998	07e27348dd	more updates	2023-09-08 10:01:48 +08:00
marcoyang1998	013cafdd6d	updates	2023-09-08 10:00:00 +08:00
marcoyang1998	522273f97e	change the text normalization for upper_case_no_punc	2023-09-08 09:57:24 +08:00
marcoyang1998	77890a6115	add context biasing at different levels	2023-09-08 09:56:45 +08:00
marcoyang1998	d4c5a1c157	updates	2023-09-08 09:55:41 +08:00
zr_jin	49a4b67288	fixed a CI test issue related to python version (#1243)	2023-09-07 19:48:46 +08:00
zr_jin	c912bd65d0	Update run-gigaspeech-pruned-transducer-stateless2-2022-05-12.sh (#1242)	2023-09-07 18:48:27 +08:00
zr_jin	d50a9ea030	doc str fixes (#1241)	2023-09-07 16:34:53 +08:00
zr_jin	9ef8145fa3	minor fixes (#1240)	2023-09-04 17:56:05 +08:00
Desh Raj	8fcadb68a7	Missing definitions in scaling.py added (#1232)	2023-08-31 10:31:05 +08:00
marcoyang1998	3a1ce5963b	Minor fix for documentation (#1229)	2023-08-29 16:39:48 +08:00
marcoyang1998	cad01bfcb6	add subformer model with style embeddings	2023-08-29 16:04:51 +08:00
marcoyang1998	16e8907805	update text normalization for librispeech test sets	2023-08-29 16:03:56 +08:00
Wei Kang	4d7f73ce65	Add context biasing for zipformer recipe (#1204) * Add context biasing for zipformer recipe * support context biasing in modified_beam_search_LODR * fix context graph * Minor fixes	2023-08-28 19:37:32 +08:00
marcoyang1998	80c54c05e2	support showing WERs of different books	2023-08-17 23:59:37 +08:00
marcoyang1998	f23882b9f6	also sample from distractors when using separate words in the ref text; increase the max length of substring	2023-08-17 12:11:33 +08:00
Fangjun Kuang	fc2df07841	Add icefall tutorials for dummies. (#1220)	2023-08-16 22:32:41 +08:00
marcoyang1998	8a238317a4	support using subformer as text encoder and train with style	2023-08-16 19:08:36 +08:00
marcoyang1998	73fa1651f0	minor updates to utils.py	2023-08-16 16:47:23 +08:00

946 Commits