icefall

mirror of https://github.com/k2-fsa/icefall.git synced 2025-12-11 06:55:27 +00:00

Author	SHA1	Message	Date
Desh Raj	9c2172c1c4	Zipformer for TedLium (#1125 ) * initial commit for zipformer tedlium * fix unk decoding * add pretrained model and logs * update for new AsrModel * add option for choosing rnnt type * add results with modified rnnt	2023-06-28 16:43:49 +08:00
Wei Kang	219bba1310	zipformer wenetspeech (#1130 ) * copy files * update train.py * small fixes * Add decode.py * Fix dataloader in decode.py * add blank penalty * Add blank-penalty to other decoding method * Minor fixes * add zipformer2 recipe * Minor fixes * Remove pruned7 * export and test models * Replace bpe with tokens in export.py and pretrain.py * Minor fixes * Minor fixes * Minor fixes * Fix export * Update results * Fix zipformer-ctc * Fix ci * Fix ci * Fix CI * Fix CI --------- Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>	2023-06-26 09:33:18 +08:00
Zengwei Yao	0ad037d076	Add CTC loss option in zipformer recipe (#1111 ) * add CTC loss option in zipformer recipe * add ctc_decode.py * support CTC model export, add jit_pretrained_ctc.py, pretrained_ctc.py * update README.md and RESULTS.md * add CI test	2023-06-14 14:27:29 +08:00
Wei Kang	ba257efbcd	Add Context biasing (#1038 ) * Add context biasing for librispeech * Add context biasing for wenetspeech * fix bugs * Implement Aho-Corasick context graph * fix some bugs * Fixes to forward_one_step; add draw to context graph * add output arc; fix black * Fix wenetspeech tokenizer * Minor fixes to the decode.py	2023-06-03 21:28:49 +08:00
Zengwei Yao	a7e142b7ff	Support long audios recognition (#980 ) * support long file transcription * rename recipe as long_file_recog * add docs * support multi-gpu decoding * style fix	2023-05-19 20:27:55 +08:00
Wei Kang	80156dda09	Training with byte level BPE (AIShell) (#986 ) * copy files from zipformer librispeech * Add byte bpe training for aishell * compile LG graph * Support LG decoding * Minor fixes * black * Minor fixes * export & fix pretrain.py * fix black * Update RESULTS.md * Fix export.py	2023-05-04 19:16:17 +08:00
marcoyang1998	45c13e90e4	RNNLM rescore + Low-order density ratio (#1017 ) * add rnnlm rescore + LODR * add LODR in decode.py * update RESULTS	2023-04-24 15:00:02 +08:00
marcoyang1998	34d1b07c3d	Modified beam search with RNNLM rescoring (#1002 ) * add RNNLM rescore * add shallow fusion and lm rescore for streaming zipformer * minor fix * update RESULTS.md * fix yesno workflow, change from ubuntu-18.04 to ubuntu-latest	2023-04-17 16:43:00 +08:00
marcoyang1998	d337398d29	Shallow fusion for Aishell (#954 ) * add shallow fusion and LODR for aishell * update RESULTS * add save by iterations	2023-04-03 16:20:29 +08:00
Zengwei Yao	2a5a75cb56	add option of using full attention for streaming model decoding (#975 )	2023-03-30 14:30:13 +08:00
Zengwei Yao	bcc5923ab9	Support batch-wise forced-alignment (#970 ) * support batch-wise forced-alignment based on beam search * add length_norm to HypothesisList.topk() * Use Hypothesis and HypothesisList instead	2023-03-28 23:24:24 +08:00
marcoyang1998	9ddd811925	Fix padding_idx (#942 ) * fix padding_idx * update RESULTS.md	2023-03-10 14:37:28 +08:00
Fangjun Kuang	f5de2e90c6	Fix style issues. (#937 )	2023-03-08 22:56:04 +08:00
pehonnet	07243d136a	remove key from result filename (#936 ) Co-authored-by: pe-honnet <pe.honnet@telepathy.ai>	2023-03-08 21:06:07 +08:00
marcoyang1998	c51e6c5b9c	fix typo (#916 )	2023-02-20 19:04:57 +08:00
Fangjun Kuang	8d3810e289	Simplify ONNX export (#881 ) * Simplify ONNX export * Fix ONNX CI tests	2023-02-07 15:01:59 +08:00
marcoyang1998	1f0408b103	Support Transformer LM (#750 ) * support transformer LM * show number of parameters during training * update docstring * testing files for ppl calculation * add lm wrampper for rnn and transformer LM * apply lm wrapper in lm shallow fusion * small updates * update decode.py to support LM fusion and LODR * add export.py * update CI and workflow * update decoding results * fix CI * remove transformer LM from CI test	2022-12-29 10:53:36 +08:00
Daniil	b293db4baf	Tedlium3 conformer ctc2 (#696 ) * modify preparation * small refacor * add tedlium3 conformer_ctc2 * modify decode * filter unk in decode * add scaling converter * address comments * fix lambda function lhotse * add implicit manifest shuffle * refactor ctc_greedy_search * import model arguments from train.py * style fix * fix ci test and last style issues * update RESULTS * fix RESULTS numbers * fix label smoothing loss * update model parameters number in RESULTS	2022-12-13 16:13:26 +08:00
Fangjun Kuang	6533f359c9	Fix CI (#726 ) * Fix CI * Disable shuffle for yesno. See https://github.com/k2-fsa/icefall/issues/197	2022-12-02 10:53:06 +08:00
marcoyang1998	4b5bc480e8	Add low-order density ratio in RNNLM shallow fusion (#678 ) * Support LODR in RNNLM shallow fusion * fix style * fix code style * update workflow and CI * update results * propagate changes to stateless3 * add decoding results for stateless3+giga * fix CI	2022-11-30 17:26:05 +08:00
huangruizhe	6693d907d3	shuffle full Librispeech data (#574 ) * shuffled full/partial librispeech data * fixed the code style issue * Shuffled full librispeech data off-line * Fixed style, addressed comments, and removed redandunt codes * Used the suggested version of black * Propagated the changes to other folders for librispeech (except conformer_mmi and streaming_conformer_ctc)	2022-11-27 11:26:09 +08:00
Desh Raj	d31db01037	manual correction of black formatting	2022-11-17 14:18:05 -05:00
Desh Raj	107df3b115	apply black on all files	2022-11-17 09:42:17 -05:00
Fangjun Kuang	60317120ca	Revert "Apply new Black style changes"	2022-11-17 20:19:32 +08:00
Desh Raj	d110b04ad3	apply new black formatting to all files	2022-11-16 13:06:43 -05:00
Fangjun Kuang	cedf9aa24f	Fix shallow fusion and add CI tests for it (#676 ) * Fix shallow fusion and add CI tests for it * Fix -1 index in embedding introduced in the zipformer PR	2022-11-13 11:51:00 +08:00
Fangjun Kuang	7e82f87126	Add Zipformer from Dan (#672 )	2022-11-12 18:11:19 +08:00
Fangjun Kuang	e334e570d8	Filter utterances with number_tokens > number_feature_frames. (#604 )	2022-11-12 07:57:58 +08:00
Zengwei Yao	32de2766d5	Refactor getting timestamps in fsa-based decoding (#660 ) * refactor getting timestamps for fsa-based decoding * fix doc * fix bug	2022-11-05 22:36:06 +08:00
Zengwei Yao	3600ce1b5f	Apply delay penalty on transducer (#654 ) * add delay penalty * fix CI * fix CI	2022-11-04 16:10:09 +08:00
marcoyang	a2d7095c1c	resolve conflicts	2022-11-04 11:37:42 +08:00
marcoyang	bdaeaae1ae	resolve conflicts	2022-11-04 11:25:10 +08:00
Wei Kang	64aed2cdeb	Fix LG log file name (#657 )	2022-11-03 23:12:35 +08:00
Wei Kang	163d929601	Add fast_beam_search_LG (#622 ) * Add fast_beam_search_LG * add fast_beam_search_LG to commonly used recipes * fix ci * fix ci * Fix error	2022-11-03 16:29:30 +08:00
marcoyang	2a52b8c125	update docs	2022-11-03 11:10:21 +08:00
marcoyang	6c8d1f9ef5	update	2022-11-02 17:48:58 +08:00
marcoyang	0a46a39e24	update decoding commands	2022-11-02 17:25:31 +08:00
marcoyang	63d0a52dbd	support RNNLM shallow fusion in stateless5	2022-11-02 16:37:29 +08:00
marcoyang	de2f5e3e6d	support RNNLM shallow fusion for LSTM transducer	2022-11-02 16:15:56 +08:00
Wei Kang	d389524d45	remove tail padding for non-streaming models (#625 )	2022-11-01 11:09:56 +08:00
Zengwei Yao	03668771d7	Get timestamps during decoding (#598 ) * print out timestamps during decoding * add word-level alignments * support to compute mean symbol delay with word-level alignments * print variance of symbol delay * update doc * support to compute delay for pruned_transducer_stateless4 * fix bug * add doc	2022-11-01 10:24:00 +08:00
ezerhouni	9b671e1c21	Add Shallow fusion in modified_beam_search (#630 ) * Add utility for shallow fusion * test batch size == 1 without shallow fusion * Use shallow fusion for modified-beam-search * Modified beam search with ngram rescoring * Fix code according to review Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>	2022-10-21 16:44:56 +08:00
Fangjun Kuang	d1f16a04bd	fix type hints for decode.py (#623 )	2022-10-18 06:56:12 +08:00
Zengwei Yao	aa58c2ee02	Modify ActivationBalancer for speed (#612 ) * add a probability to apply ActivationBalancer * minor fix * minor fix	2022-10-13 15:14:28 +08:00
Fangjun Kuang	1c07d2fb37	Remove all-in-one for onnx export (#614 ) * Remove all-in-one for onnx export * Exit on error for CI	2022-10-12 10:34:06 +08:00
Zengwei Yao	f3ad32777a	Gradient filter for training lstm model (#564 ) * init files * add gradient filter module * refact getting median value * add cutoff for grad filter * delete comments * apply gradient filter in LSTM module, to filter both input and params * fix typing and refactor * filter with soft mask * rename lstm_transducer_stateless2 to lstm_transducer_stateless3 * fix typos, and update RESULTS.md * minor fix * fix return typing * fix typo	2022-09-29 11:15:43 +08:00
LIyong.Guo	923b60a7c6	padding zeros (#591 )	2022-09-28 21:20:33 +08:00
marcoyang1998	1e31fbcd7d	Add clamping operation in Eve optimizer for all scalar weights to avoid non stable training in some scenarios (#550). The clamping range is set to (-10,2). Note that this change may cause unexpected effect if you resume training from a model that is trained without clamping.	2022-08-25 12:12:50 +08:00
Zengwei Yao	f2f5baf687	Use ScaledLSTM as streaming encoder (#479 ) * add ScaledLSTM * add RNNEncoderLayer and RNNEncoder classes in lstm.py * add RNN and Conv2dSubsampling classes in lstm.py * hardcode bidirectional=False * link from pruned_transducer_stateless2 * link scaling.py pruned_transducer_stateless2 * copy from pruned_transducer_stateless2 * modify decode.py pretrained.py test_model.py train.py * copy streaming decoding files from pruned_transducer_stateless2 * modify streaming decoding files * simplified code in ScaledLSTM * flat weights after scaling * pruned2 -> pruned4 * link __init__.py * fix style * remove add_model_arguments * modify .flake8 * fix style * fix scale value in scaling.py * add random combiner for training deeper model * add using proj_size * add scaling converter for ScaledLSTM * support jit trace * add using averaged model in export.py * modify test_model.py, test if the model can be successfully exported by jit.trace * modify pretrained.py * support streaming decoding * fix model.py * Add cut_id to recognition results * Add cut_id to recognition results * do not pad in Conv subsampling module; add tail padding during decoding. * update RESULTS.md * minor fix * fix doc * update README.md * minor change, filter infinite loss * remove the condition of raise error * modify type hint for the return value in model.py * minor change * modify RESULTS.md Co-authored-by: pkufool <wkang.pku@gmail.com>	2022-08-19 14:38:45 +08:00
marcoyang1998	c74cec59e9	propagate changes from #525 to other librispeech recipes (#531 ) * propaga changes from #525 to other librispeech recipes * refactor display_and_save_batch to utils * fixed typo * reformat code style	2022-08-17 17:18:15 +08:00


183 Commits