icefall

Author	SHA1	Message	Date
Piotr Żelasko	f92c24a73a	Merge branch 'master' into feature/libri-conformer-phone-ctc	2022-01-24 10:18:56 -05:00
Piotr Żelasko	565c1d8413	Address code review	2022-01-24 10:17:47 -05:00
Piotr Żelasko	1d5fe8afa4	flake8	2022-01-21 17:27:02 -05:00
Piotr Żelasko	f0f35e6671	black	2022-01-21 17:22:41 -05:00
Piotr Żelasko	f28951f2b6	Add an assertion	2022-01-21 17:16:49 -05:00
Piotr Żelasko	3d109b121d	Remove train_phones.py and modify train.py instead	2022-01-21 17:08:53 -05:00
Fangjun Kuang	d6050eb02e	Fix calling optimized_transducer after new release. (#182 )	2022-01-21 08:18:50 +08:00
Fangjun Kuang	f94ff19bfe	Refactor beam search and update results. (#177 )	2022-01-18 16:40:19 +08:00
Fangjun Kuang	273e5fb2f3	Update git SHA1 for transducer_stateless model. (#174 )	2022-01-10 11:58:17 +08:00
Fangjun Kuang	4c1b3665ee	Use optimized_transducer to compute transducer loss. (#162 ) * WIP: Use optimized_transducer to compute transducer loss. * Minor fixes. * Fix decoding. * Fix decoding. * Add RESULTS. * Update RESULTS. * Update CI. * Fix sampling rate for yesno recipe.	2022-01-10 11:54:58 +08:00
Fangjun Kuang	413b2e8569	Add git sha1 to RESULTS.md for conformer encoder + stateless decoder. (#160 )	2021-12-28 12:04:01 +08:00
Fangjun Kuang	14c93add50	Remove batchnorm, weight decay, and SOS from transducer conformer encoder (#155 ) * Remove batchnorm, weight decay, and SOS. * Make --context-size configurable. * Update results.	2021-12-27 16:01:10 +08:00
Fangjun Kuang	8187d6236c	Minor fix to maximum number of symbols per frame for RNN-T decoding. (#157 ) * Minor fix to maximum number of symbols per frame RNN-T decoding. * Minor fixes.	2021-12-24 21:48:40 +08:00
Fangjun Kuang	5b6699a835	Minor fixes to the RNN-T Conformer model (#152 ) * Disable weight decay. * Remove input feature batchnorm.. * Replace BatchNorm in the Conformer model with LayerNorm. * Use tanh in the joint network. * Remove sos ID. * Reduce the number of decoder layers from 4 to 2. * Minor fixes. * Fix typos.	2021-12-23 13:54:25 +08:00
Fangjun Kuang	fb6a57e9e0	Increase the size of the context in the RNN-T decoder. (#153 )	2021-12-23 07:55:02 +08:00
Fangjun Kuang	cb04c8a750	Limit the number of symbols per frame in RNN-T decoding. (#151 )	2021-12-18 11:00:42 +08:00
Fangjun Kuang	1d44da845b	RNN-T Conformer training for LibriSpeech (#143 ) * Begin to add RNN-T training for librispeech. * Copy files from conformer_ctc. Will edit it. * Use conformer/transformer model as encoder. * Begin to add training script. * Add training code. * Remove long utterances to avoid OOM when a large max_duraiton is used. * Begin to add decoding script. * Add decoding script. * Minor fixes. * Add beam search. * Use LSTM layers for the encoder. Need more tunings. * Use stateless decoder. * Minor fixes to make it ready for merge. * Fix README. * Update RESULT.md to include RNN-T Conformer. * Minor fixes. * Fix tests. * Minor fixes. * Minor fixes. * Fix tests.	2021-12-18 07:42:51 +08:00
Wei Kang	a183d5bfd7	Remove batchnorm (#147 ) * Remove batch normalization * Minor fixes * Fix typo * Fix comments * Add assertion for use_feat_batchnorm	2021-12-14 08:20:03 +08:00
Fangjun Kuang	1aff64b708	Apply layer normalization to the output of each gate in LSTM/GRU. (#139 ) * Apply layer normalization to the output of each gate in LSTM. * Apply layer normalization to the output of each gate in GRU. * Add projection support to LayerNormLSTMCell. * Add GPU tests. * Use typeguard.check_argument_types() to validate type annotations. * Add typeguard as a requirement. * Minor fixes. * Fix CI. * Fix CI. * Fix test failures for torch 1.8.0 * Fix errors.	2021-12-07 18:38:03 +08:00
Fangjun Kuang	ec591698b0	Associate a cut with token alignment (without repeats) (#125 ) * WIP: Associate a cut with token alignment (without repeats) * Save framewise alignments with/without repeats. * Minor fixes.	2021-11-29 18:50:54 +08:00
Fangjun Kuang	243fb9723c	Fix an error introduced while supporting torchscript. (#134 ) Should be `G.dummy = 1`, not `G["dummy"] = 1`.	2021-11-27 09:07:04 +08:00
Fangjun Kuang	0e541f5b5d	Print hostname and IP address to the log. (#131 ) We are using multiple machines to do various experiments. It makes life easier to know which experiment is running on which machine if we also log the IP and hostname of the machine.	2021-11-26 11:25:59 +08:00
LIyong.Guo	00e2f0ade8	Draft streaming decoding (#89 ) * reusable parts from conformer_ctc * streaming conformer code * a trained model	2021-11-24 19:35:18 +08:00
Piotr Żelasko	8eb94fa4a0	CTC-only phone conformer recipe for LibriSpeech	2021-11-23 15:34:46 -05:00
Wei Kang	4151cca147	Add torch script support for Aishell and update documents (#124 ) * Add aishell recipe * Remove unnecessary code and update docs * adapt to k2 v1.7, add docs and results * Update conformer ctc model * Update docs, pretrained.py & results * Fix code style * Fix code style * Fix code style * Minor fix * Minor fix * Fix pretrained.py * Update pretrained model & corresponding docs * Export torch script model for Aishell * Add C++ deployment docs * Minor fixes * Fix unit test * Update Readme	2021-11-19 16:37:05 +08:00
Fangjun Kuang	0660d12e4e	Fix computing WERs for empty hypotheses (#118 ) * Fix computing WERs when empty lattices are generated. * Minor fixes.	2021-11-17 19:25:47 +08:00
Fangjun Kuang	336283f872	New label smoothing (#109 ) * Modify label smoothing to match the one implemented in PyTorch. * Enable CI for torch 1.10 * Fix CI errors. * Fix CI installation errors. * Fix CI installation errors. * Minor fixes. * Minor fixes. * Minor fixes. * Minor fixes. * Minor fixes. * Fix CI errors.	2021-11-17 19:24:07 +08:00
Fangjun Kuang	68506609ad	Set fsa.properties to None after changing its labels in-place. (#121 )	2021-11-16 23:11:30 +08:00
Fangjun Kuang	8d679c3e74	Fix typos. (#115 )	2021-11-10 14:45:30 +08:00
Fangjun Kuang	21096e99d8	Update result for the librispeech recipe using vocab size 500 and att rate 0.8 (#113 ) * Update RESULTS using vocab size 500, att rate 0.8 * Update README. * Refactoring. Since FSAs in an Nbest object are linear in structure, we can add the scores of a path to compute the total scores. * Update documentation. * Change default vocab size from 5000 to 500.	2021-11-10 14:32:52 +08:00
Fangjun Kuang	42b437bea6	Use pre-sorted text to generate token ids for attention decoder. (#98 ) * Use pre-sorted text to generate token ids for attention decoder. See https://github.com/k2-fsa/icefall/issues/97 for more details. * Fix typos.	2021-10-29 13:46:41 +08:00
Fangjun Kuang	8cb7f712e4	Use GPU for averaging checkpoints if possible. (#84 )	2021-10-26 17:10:04 +08:00
Fangjun Kuang	712ead8207	Fix an error when attention decoder rescoring returns None. (#90 )	2021-10-22 19:52:25 +08:00
Piotr Żelasko	3cc99d2af2	make flake8 happy	2021-10-19 11:24:54 -04:00
Piotr Żelasko	86f3e0ef37	Make flake8 happy	2021-10-18 09:54:40 -04:00
Piotr Żelasko	6fbd7a287c	Refactor OOM batch scanning into a local function	2021-10-18 09:53:04 -04:00
Piotr Żelasko	d509d58f30	Merge branch 'master' into feature/find-pessimistic-batches	2021-10-18 09:47:21 -04:00
Fangjun Kuang	3effcb4225	Fix typos. (#85 )	2021-10-18 16:17:14 +08:00
Fangjun Kuang	53b79fafa7	Add MMI training with word pieces as modelling unit. (#6 ) * Fix an error in TDNN-LSTM training. * WIP: Refactoring * Refactor transformer.py * Remove unused code. * Minor fixes. * Fix decoder padding mask. * Add MMI training with word pieces. * Remove unused files. * Minor fixes. * Refactoring. * Minor fixes. * Use pre-computed alignments in LF-MMI training. * Minor fixes. * Update decoding script. * Add doc about how to check and use extracted alignments. * Fix style issues. * Fix typos. * Fix style issues. * Disable macOS tests for now.	2021-10-18 15:20:32 +08:00
Fangjun Kuang	4890e27b45	Extract framewise alignment information using CTC decoding (#39 ) * Use new APIs with k2.RaggedTensor * Fix style issues. * Update the installation doc, saying it requires at least k2 v1.7 * Extract framewise alignment information using CTC decoding. * Print environment information. Print information about k2, lhotse, PyTorch, and icefall. * Fix CI. * Fix CI. * Compute framewise alignment information of the LibriSpeech dataset. * Update comments for the time to compute alignments of train-960. * Preserve cut id in mix cut transformer. * Minor fixes. * Add doc about how to extract framewise alignments.	2021-10-18 14:24:33 +08:00
Piotr Żelasko	403d1744ff	Introduce backprop in finding OOM batches	2021-10-15 10:05:13 -04:00
Piotr Żelasko	060117a9ff	Reformatting	2021-10-14 21:40:14 -04:00
Piotr Żelasko	1c7c79f2fc	Find CUDA OOM batches before starting training	2021-10-14 21:28:11 -04:00
Fangjun Kuang	fee1f84b20	Test pre-trained model in CI (#80 ) * Add CI to run pre-trained models. * Minor fixes. * Install kaldifeat * Install a CPU version of PyTorch. * Fix CI errors. * Disable decoder layers in pretrained.py if it is not used. * Clone pre-trained model from GitHub. * Minor fixes. * Minor fixes. * Minor fixes.	2021-10-15 00:41:33 +08:00
Mingshuang Luo	5401ce199d	Update ctc-decoding on pretrained.py and conformer_ctc.rst (#78 )	2021-10-14 23:29:06 +08:00
Fangjun Kuang	f2387fe523	Fix a bug introduced while supporting torch script. (#79 )	2021-10-14 20:09:38 +08:00
Fangjun Kuang	5016ee3c95	Give an informative message when users provide an unsupported decoding method (#77 )	2021-10-14 16:20:35 +08:00
Mingshuang Luo	39bc8cae94	Add ctc decoding to pretrained.py on conformer_ctc (#75 ) * Add ctc-decoding to pretrained.py * update pretrained.py and conformer_ctc.rst * update ctc-decoding for pretrained.py on conformer_ctc * Update pretrained.py * fix the style issue * Update conformer_ctc.rst * Update the running logs	2021-10-13 12:20:16 +08:00
Mingshuang Luo	391432b356	Update train.py ("10"--->"params.log_interval") (#76 ) * Update train.py * Update train.py * Update train.py	2021-10-12 21:30:31 +08:00
Mingshuang Luo	597c5efdb1	Use LossRecord to record and print the loss for the training process (#62 ) * Update index.rst (AS->ASR) * Update conformer_ctc.rst (pretraind->pretrained) * Fix some spelling errors. * Fix some spelling errors. * Use LossRecord to record and print loss in the training process * Change the name "LossRecord" to "MetricsTracker"	2021-10-12 15:58:03 +08:00

645 Commits