icefall

Author	SHA1	Message	Date
Fangjun Kuang	feb526c2a4	Predicting blanks via gradients from the trivial joiner.	2022-03-31 20:12:41 +08:00
Fangjun Kuang	239a8fa1f2	Copy files for editing.	2022-03-31 17:34:06 +08:00
Fangjun Kuang	e7d369ab29	Copy files.	2022-03-31 16:51:27 +08:00
Fangjun Kuang	9a11808ed3	Set the seed for dataloader. (#282) Also, suppress torch warnings about division by truncation.	2022-03-31 16:48:46 +08:00
Fangjun Kuang	395a3f952b	Batch decoding for models trained with optimized_transducer (#267) * Add greedy search in batch mode. * Add modified beam search in batch mode.	2022-03-23 19:11:34 +08:00
Fangjun Kuang	3ae7265737	More fixes to the checkpoint code. (#266)	2022-03-23 14:37:54 +08:00
Fangjun Kuang	6a091da0b0	Minor fixes for saving checkpoints. (#265) * Minor fixes for saving checkpoints. * Fix loading checkpoints saved by previous code.	2022-03-23 12:22:05 +08:00
Fangjun Kuang	8c7995d493	Support modified beam search in batch mode. (#264) * Support modified beam search in batch mode. * Update k2 versions in GitHub CI.	2022-03-22 15:14:04 +08:00
Fangjun Kuang	d5c78a2238	Implement greedy search in batch mode for transducer decoding. (#262)	2022-03-22 10:32:22 +08:00
Wei Kang	b2b4d9e0b6	Add fast beam search decoding (#250) * Add fast beam search decoding * Minor fixes * Minor fixes * Minor fixes * Fix comments * Fix comments	2022-03-21 16:22:25 +08:00
Fangjun Kuang	ae564f91e6	Periodically saving checkpoint after processing given number of batches (#259) * Periodically saving checkpoint after processing given number of batches.	2022-03-20 23:51:33 +08:00
Mingshuang Luo	d0d806560f	Change for asr_datamodule.py (#241) * change for asr_datamodule.py * fix style check * do a fix	2022-03-14 00:30:58 +08:00
Fangjun Kuang	bb7f6ed6b7	Add modified beam search for pruned rnn-t. (#248) * Add modified beam search for pruned rnn-t. * Fix style issues. * Update RESULTS.md. * Fix typos. * Minor fixes. * Test the pre-trained model using GitHub actions. * Let the user install optimized_transducer on her own. * Fix errors in GitHub CI.	2022-03-12 16:16:55 +08:00
Fangjun Kuang	2f4e71f433	Add force alignment for stateless transducer. (#239) * Add force alignment for stateless transducer. * Add more documentation. * Compute word starting time from framewise token alignment. * Update README to include force alignment information. * Fix typos. * Fix more typos. * Fixes after review.	2022-03-12 16:16:15 +08:00
Fangjun Kuang	1603744469	Refactor conformer. (#237)	2022-03-05 19:26:06 +08:00
yaozengwei	ad62981765	Add diagnostics (#230) * Adding diagnostics code... * Move diagnostics code from local dir to the shared icefall dir * Remove the diagnostics code in the local dir * Update docs of arguments, and remove stats_types() function in TensorDiagnosticOptions object. * Update docs of arguments. * Add copyright information. * Corrected the time in copyright information. Co-authored-by: Daniel Povey <dpovey@gmail.com>	2022-03-04 15:38:23 +08:00
Fangjun Kuang	3ec219dfa0	Add stateless transducer tutorial. (#235) * WIP: Add stateless transducer tutorial. * Add more doc. * Minor fixes.	2022-03-03 22:33:47 +08:00
Fangjun Kuang	1ff6196c44	Fix joiner (#234) * Add tests for Joiner * Remove duplicate files.	2022-03-02 16:41:14 +08:00
Fangjun Kuang	50d2281524	Add modified transducer loss for AIShell dataset (#219) * Add modified transducer for aishell. * Minor fixes. * Add extra data in transducer training. The extra data is from http://www.openslr.org/62/ * Update export.py and pretrained.py * Update CI to install pretrained models with aishell. * Update results. * Update results. * Update README. * Use symlinks to avoid copies.	2022-03-02 16:02:38 +08:00
Fangjun Kuang	05cb297858	Update result for full libri + GigaSpeech using transducer_stateless. (#231)	2022-03-01 17:01:46 +08:00
Fangjun Kuang	72f838dee1	Update results for transducer_stateless after training for more epochs. (#207)	2022-03-01 16:35:02 +08:00
Fangjun Kuang	2332ba312d	Begin to use multiple datasets in training (#213) * Begin to use multiple datasets. * Finish preparing training datasets. * Minor fixes * Copy files. * Finish training code. * Display losses for gigaspeech and librispeech separately. * Fix decode.py * Make the probability to select a batch from GigaSpeech configurable. * Update results. * Minor fixes.	2022-02-21 15:27:27 +08:00
Fangjun Kuang	1c35ae1dba	Reset seed at the beginning of each epoch. (#221) * Reset seed at the beginning of each epoch. * Use a different seed for each epoch.	2022-02-21 15:16:39 +08:00
Wei Kang	b702281e90	Use k2 pruned transducer loss to train conformer-transducer model (#194) * Using k2 pruned version transducer loss to train model * Fix style * Minor fixes	2022-02-17 13:33:54 +08:00
Wang, Guanbo	70a3c56a18	Fix librispeech train.py (#211) * fix librispeech train.py * remove note	2022-02-09 16:42:28 +08:00
Fangjun Kuang	27fa5f05d3	Update git SHA-1 in RESULTS.md for transducer_stateless. (#202)	2022-02-07 18:45:45 +08:00
Fangjun Kuang	a8150021e0	Use modified transducer loss in training. (#179) * Use modified transducer loss in training. * Minor fix. * Add modified beam search. * Add modified beam search. * Minor fixes. * Fix typo. * Update RESULTS. * Fix a typo. * Minor fixes.	2022-02-07 18:37:36 +08:00
Wei Kang	35ecd7e562	Fix torch.nn.Embedding error for torch below 1.8.0 (#198)	2022-02-06 21:59:54 +08:00
Wei Kang	5ae80dfca7	Minor fixes (#193)	2022-01-27 18:01:17 +08:00
Piotr Żelasko	1731cc37bb	Black	2022-01-24 10:20:22 -05:00
Piotr Żelasko	f92c24a73a	Merge branch 'master' into feature/libri-conformer-phone-ctc	2022-01-24 10:18:56 -05:00
Piotr Żelasko	565c1d8413	Address code review	2022-01-24 10:17:47 -05:00
Piotr Żelasko	1d5fe8afa4	flake8	2022-01-21 17:27:02 -05:00
Piotr Żelasko	f0f35e6671	black	2022-01-21 17:22:41 -05:00
Piotr Żelasko	f28951f2b6	Add an assertion	2022-01-21 17:16:49 -05:00
Piotr Żelasko	3d109b121d	Remove train_phones.py and modify train.py instead	2022-01-21 17:08:53 -05:00
Fangjun Kuang	d6050eb02e	Fix calling optimized_transducer after new release. (#182)	2022-01-21 08:18:50 +08:00
Fangjun Kuang	f94ff19bfe	Refactor beam search and update results. (#177)	2022-01-18 16:40:19 +08:00
Fangjun Kuang	273e5fb2f3	Update git SHA1 for transducer_stateless model. (#174)	2022-01-10 11:58:17 +08:00
Fangjun Kuang	4c1b3665ee	Use optimized_transducer to compute transducer loss. (#162) * WIP: Use optimized_transducer to compute transducer loss. * Minor fixes. * Fix decoding. * Fix decoding. * Add RESULTS. * Update RESULTS. * Update CI. * Fix sampling rate for yesno recipe.	2022-01-10 11:54:58 +08:00
Fangjun Kuang	413b2e8569	Add git sha1 to RESULTS.md for conformer encoder + stateless decoder. (#160)	2021-12-28 12:04:01 +08:00
Fangjun Kuang	14c93add50	Remove batchnorm, weight decay, and SOS from transducer conformer encoder (#155) * Remove batchnorm, weight decay, and SOS. * Make --context-size configurable. * Update results.	2021-12-27 16:01:10 +08:00
Fangjun Kuang	8187d6236c	Minor fix to maximum number of symbols per frame for RNN-T decoding. (#157) * Minor fix to maximum number of symbols per frame RNN-T decoding. * Minor fixes.	2021-12-24 21:48:40 +08:00
Fangjun Kuang	5b6699a835	Minor fixes to the RNN-T Conformer model (#152) * Disable weight decay. * Remove input feature batchnorm.. * Replace BatchNorm in the Conformer model with LayerNorm. * Use tanh in the joint network. * Remove sos ID. * Reduce the number of decoder layers from 4 to 2. * Minor fixes. * Fix typos.	2021-12-23 13:54:25 +08:00
Fangjun Kuang	fb6a57e9e0	Increase the size of the context in the RNN-T decoder. (#153)	2021-12-23 07:55:02 +08:00
Fangjun Kuang	cb04c8a750	Limit the number of symbols per frame in RNN-T decoding. (#151)	2021-12-18 11:00:42 +08:00
Fangjun Kuang	1d44da845b	RNN-T Conformer training for LibriSpeech (#143) * Begin to add RNN-T training for librispeech. * Copy files from conformer_ctc. Will edit it. * Use conformer/transformer model as encoder. * Begin to add training script. * Add training code. * Remove long utterances to avoid OOM when a large max_duraiton is used. * Begin to add decoding script. * Add decoding script. * Minor fixes. * Add beam search. * Use LSTM layers for the encoder. Need more tunings. * Use stateless decoder. * Minor fixes to make it ready for merge. * Fix README. * Update RESULT.md to include RNN-T Conformer. * Minor fixes. * Minor fixes. * Fix tests. * Minor fixes. * Minor fixes. * Fix tests.	2021-12-18 07:42:51 +08:00
Wei Kang	a183d5bfd7	Remove batchnorm (#147) * Remove batch normalization * Minor fixes * Fix typo * Fix comments * Add assertion for use_feat_batchnorm	2021-12-14 08:20:03 +08:00
Fangjun Kuang	1aff64b708	Apply layer normalization to the output of each gate in LSTM/GRU. (#139) * Apply layer normalization to the output of each gate in LSTM. * Apply layer normalization to the output of each gate in GRU. * Add projection support to LayerNormLSTMCell. * Add GPU tests. * Use typeguard.check_argument_types() to validate type annotations. * Add typeguard as a requirement. * Minor fixes. * Fix CI. * Fix CI. * Fix test failures for torch 1.8.0 * Fix errors.	2021-12-07 18:38:03 +08:00
Fangjun Kuang	ec591698b0	Associate a cut with token alignment (without repeats) (#125) * WIP: Associate a cut with token alignment (without repeats) * Save framewise alignments with/without repeats. * Minor fixes.	2021-11-29 18:50:54 +08:00

125 Commits