icefall

Author	SHA1	Message	Date
Fangjun Kuang	97b3fc53aa	Add LSTM for the multi-dataset setup. (#558 ) * Add LSTM for the multi-dataset setup. * Add results * fix style issues * add missing file	2022-09-16 18:40:25 +08:00
Zengwei Yao	f2f5baf687	Use ScaledLSTM as streaming encoder (#479 ) * add ScaledLSTM * add RNNEncoderLayer and RNNEncoder classes in lstm.py * add RNN and Conv2dSubsampling classes in lstm.py * hardcode bidirectional=False * link from pruned_transducer_stateless2 * link scaling.py pruned_transducer_stateless2 * copy from pruned_transducer_stateless2 * modify decode.py pretrained.py test_model.py train.py * copy streaming decoding files from pruned_transducer_stateless2 * modify streaming decoding files * simplified code in ScaledLSTM * flat weights after scaling * pruned2 -> pruned4 * link __init__.py * fix style * remove add_model_arguments * modify .flake8 * fix style * fix scale value in scaling.py * add random combiner for training deeper model * add using proj_size * add scaling converter for ScaledLSTM * support jit trace * add using averaged model in export.py * modify test_model.py, test if the model can be successfully exported by jit.trace * modify pretrained.py * support streaming decoding * fix model.py * Add cut_id to recognition results * Add cut_id to recognition results * do not pad in Conv subsampling module; add tail padding during decoding. * update RESULTS.md * minor fix * fix doc * update README.md * minor change, filter infinite loss * remove the condition of raise error * modify type hint for the return value in model.py * minor change * modify RESULTS.md Co-authored-by: pkufool <wkang.pku@gmail.com>	2022-08-19 14:38:45 +08:00
Zengwei Yao	bc2882ddcc	Simplified memory bank for Emformer (#440 ) * init files * use average value as memory vector for each chunk * change tail padding length from right_context_length to chunk_length * correct the files, ln -> cp * fix bug in conv_emformer_transducer_stateless2/emformer.py * fix doc in conv_emformer_transducer_stateless/emformer.py * refactor init states for stream * modify .flake8 * fix bug about memory mask when memory_size==0 * add @torch.jit.export for init_states function * update RESULTS.md * minor change * update README.md * modify doc * replace torch.div() with << * fix bug, >> -> << * use i&i-1 to judge if it is a power of 2 * minor fix * fix error in RESULTS.md	2022-07-12 19:19:58 +08:00
Zengwei Yao	53f38c01d2	Emformer with conv module and scaling mechanism (#389 ) * copy files from existing branch * add rule in .flake8 * minor style fix * fix typos * add tail padding * refactor, use fixed-length cache for batch decoding * copy from streaming branch * copy from streaming branch * modify emformer states stack and unstack, streaming decoding, to be continued * refactor Stream class * rename streaming_feature_extractor.py * refactor streaming decoding * test states stack and unstack * fix bugs, no grad, and num_proccessed_frames * add modify_beam_search, fast_beam_search * support torch.jit.export * use torch.div * copy from pruned_transducer_stateless4 * modify export.py * add author info * delete other test functions * minor fix * modify doc * fix style * minor fix doc * minor fix * minor fix doc * update RESULTS.md * fix typo * add info * fix typo * fix doc * add test function for conv module, and minor fix. * add copyright info * minor change of test_emformer.py * fix doc of stack and unstack, test case with batch_size=1 * update README.md	2022-06-13 15:09:17 +08:00
Fangjun Kuang	fbfc98f1d3	Add streaming Emformer stateless RNN-T. (#390 ) * Add streaming Emformer stateless RNN-T. * Update results for streaming Emformer. * Minor fixes.	2022-06-01 14:31:47 +08:00
LIyong.Guo	c4ee2bc0af	[Ready to merge]stateless6: states4 + hubert distillation. (#387 ) * a copy of stateless4 as base * distillation with hubert * fix typo * example usage * usage * Update egs/librispeech/ASR/pruned_transducer_stateless6/hubert_xlarge.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * fix comment * add results of 100hours * Update egs/librispeech/ASR/pruned_transducer_stateless6/hubert_xlarge.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * Update egs/librispeech/ASR/pruned_transducer_stateless6/hubert_xlarge.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * check fairseq and quantization * a short intro to distillation framework * Update egs/librispeech/ASR/pruned_transducer_stateless6/hubert_xlarge.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * add intro of stateless6 in README * fix type error of dst_manifest_dir * Update egs/librispeech/ASR/pruned_transducer_stateless6/hubert_xlarge.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * make export.py call stateless6/train.py instead of stateless2/train.py * update results by stateless6 * adjust results format * fix typo Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>	2022-05-28 12:37:50 +08:00
Fangjun Kuang	2f1e23cde1	Narrower and deeper conformer (#330 ) * Copy files for editing. * Add random combine from #229. * Minor fixes. * Pass model parameters from the command line. * Fix warnings. * Fix warnings. * Update readme. * Rename to avoid conflicts. * Update results. * Add CI for pruned_transducer_stateless5 * Typo fixes. * Remove random combiner. * Update decode.py and train.py to use periodically averaged models. * Minor fixes. * Revert to use random combiner. * Update results. * Minor fixes.	2022-05-23 14:39:11 +08:00
Fangjun Kuang	6dc2e04462	Update results. (#340 ) * Update results. * Typo fixes.	2022-04-29 15:49:45 +08:00
Fangjun Kuang	ac84220de9	Modified conformer with multi datasets (#312 ) * Copy files for editing. * Use librispeech + gigaspeech with modified conformer. * Support specifying number of workers for on-the-fly feature extraction. * Feature extraction code for GigaSpeech. * Combine XL splits lazily during training. * Fix warnings in decoding. * Add decoding code for GigaSpeech. * Fix decoding the gigaspeech dataset. We have to use the decoder/joiner networks for the GigaSpeech dataset. * Disable speed perturb for XL subset. * Compute the Nbest oracle WER for RNN-T decoding. * Minor fixes. * Minor fixes. * Add results. * Update results. * Update CI. * Update results. * Fix style issues. * Update results. * Fix style issues.	2022-04-29 15:40:30 +08:00
Fangjun Kuang	fce7f3cd9a	Support computing RNN-T loss with torchaudio (#316 )	2022-04-19 18:47:13 +08:00
Daniel Povey	e8eb0b94d9	Updating RESULTS.md; fix in beam_search.py	2022-04-11 21:00:11 +08:00
Fangjun Kuang	bb7f6ed6b7	Add modified beam search for pruned rnn-t. (#248 ) * Add modified beam search for pruned rnn-t. * Fix style issues. * Update RESULTS.md. * Fix typos. * Minor fixes. * Test the pre-trained model using GitHub actions. * Let the user install optimized_transducer on her own. * Fix errors in GitHub CI.	2022-03-12 16:16:55 +08:00
Fangjun Kuang	3ec219dfa0	Add stateless transducer tutorial. (#235 ) * WIP: Add stateless transducer tutorial. * Add more doc. * Minor fixes.	2022-03-03 22:33:47 +08:00
Fangjun Kuang	2332ba312d	Begin to use multiple datasets in training (#213 ) * Begin to use multiple datasets. * Finish preparing training datasets. * Minor fixes * Copy files. * Finish training code. * Display losses for gigaspeech and librispeech separately. * Fix decode.py * Make the probability to select a batch from GigaSpeech configurable. * Update results. * Minor fixes.	2022-02-21 15:27:27 +08:00
Fangjun Kuang	fb6a57e9e0	Increase the size of the context in the RNN-T decoder. (#153 )	2021-12-23 07:55:02 +08:00
Fangjun Kuang	96e7f5c7ea	Release v0.1 (#26 )	2021-08-24 21:30:30 +08:00
Fangjun Kuang	12a2fd023e	Add doc about installation and usage (#7 ) * Add readme. * Add TOC. * fix typos * Minor fixes after review.	2021-08-12 12:44:04 +08:00
Fangjun Kuang	f65854cca5	Add BPE decoding results.	2021-07-27 17:38:47 +08:00
Fangjun Kuang	40eed74460	Download LM for LibriSpeech.	2021-07-15 21:09:14 +08:00

19 Commits