icefall

Author	SHA1	Message	Date
Daniel Povey	7d65f6480a	Merge 8e650a584134e9cd42216427ac1f2f2a0ae45b74 into bd7c2f7645c0aea4cb482cc1f60836907a61d36b	2021-10-17 00:34:19 +03:30
Fangjun Kuang	fee1f84b20	Test pre-trained model in CI (#80) * Add CI to run pre-trained models. * Minor fixes. * Install kaldifeat * Install a CPU version of PyTorch. * Fix CI errors. * Disable decoder layers in pretrained.py if it is not used. * Clone pre-trained model from GitHub. * Minor fixes. * Minor fixes. * Minor fixes.	2021-10-15 00:41:33 +08:00
Mingshuang Luo	5401ce199d	Update ctc-decoding on pretrained.py and conformer_ctc.rst (#78)	2021-10-14 23:29:06 +08:00
Fangjun Kuang	f2387fe523	Fix a bug introduced while supporting torch script. (#79)	2021-10-14 20:09:38 +08:00
Fangjun Kuang	5016ee3c95	Give an informative message when users provide an unsupported decoding method (#77)	2021-10-14 16:20:35 +08:00
Mingshuang Luo	39bc8cae94	Add ctc decoding to pretrained.py on conformer_ctc (#75) * Add ctc-decoding to pretrained.py * update pretrained.py and conformer_ctc.rst * update ctc-decoding for pretrained.py on conformer_ctc * Update pretrained.py * fix the style issue * Update conformer_ctc.rst * Update the running logs	2021-10-13 12:20:16 +08:00
Mingshuang Luo	391432b356	Update train.py ("10"--->"params.log_interval") (#76) * Update train.py * Update train.py * Update train.py	2021-10-12 21:30:31 +08:00
Mingshuang Luo	597c5efdb1	Use LossRecord to record and print the loss for the training process (#62) * Update index.rst (AS->ASR) * Update conformer_ctc.rst (pretraind->pretrained) * Fix some spelling errors. * Fix some spelling errors. * Use LossRecord to record and print loss in the training process * Change the name "LossRecord" to "MetricsTracker"	2021-10-12 15:58:03 +08:00
Fangjun Kuang	beb54ddb61	Support torch script. (#65) * WIP: Support torchscript. * Minor fixes. * Fix style issues. * Add documentation about how to deploy a trained model.	2021-10-12 14:55:05 +08:00
Piotr Żelasko	069ebaf9ba	Reformatting	2021-10-09 14:45:46 +00:00
Piotr Żelasko	b682467e4d	Use BucketingSampler for dev and test data	2021-10-08 22:32:13 -04:00
Daniel Povey	8e650a5841	Update egs/librispeech/ASR/conformer_lm/conformer.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>	2021-09-27 12:19:01 +08:00
Fangjun Kuang	707d7017a7	Support pure ctc decoding requiring neither a lexicon nor an n-gram LM (#58) * Rename lattice_score_scale to nbest_scale. * Support pure CTC decoding requiring neither a lexicion nor an n-gram LM. * Fix style issues. * Fix a typo. * Minor fixes.	2021-09-26 14:21:49 +08:00
Daniel Povey	0cfa8c80e0	Merge remote-tracking branch 'upstream/master' into conformer_lm	2021-09-24 11:21:50 +08:00
Fangjun Kuang	a80e58e15d	Refactor decode.py to make it more readable and more modular. (#44) * Refactor decode.py to make it more readable and more modular. * Fix an error. Nbest.fsa should always have token IDs as labels and word IDs as aux_labels. * Add nbest decoding. * Compute edit distance with k2. * Refactor nbest-oracle. * Add rescore with nbest lists. * Add whole-lattice rescoring. * Add rescoring with attention decoder. * Refactoring. * Fixes after refactoring. * Fix a typo. * Minor fixes. * Replace [] with () for shapes. * Use k2 v1.9 * Use Levenshtein graphs/alignment from k2 v1.9 * [doc] Require k2 >= v1.9 * Minor fixes.	2021-09-20 15:44:54 +08:00
Wei Kang	9a6e0489c8	update api for RaggedTensor (#45) * Fix code style * update k2 version in CI * fix compile hlg	2021-09-14 16:39:56 +08:00
Daniel Povey	3ce1de337d	UPdates for new k2 version; change LR decay from 0.85 to 0.9	2021-09-13 20:57:02 +08:00
Wei Kang	24656e9749	Update docs and remove unnecessary arguments (#42) * Fix typo in docs * Update docs and remove unnecessary arguments * Fix code style	2021-09-13 18:28:57 +08:00
Fangjun Kuang	f792b466bf	Change default value of lattice-score-scale from 1.0 to 0.5 (#41) * Change the default value of lattice-score-scale from 1.0 to 0.5 * Fix CI.	2021-09-13 10:49:18 +08:00
Daniel Povey	d0e5b9b8a5	Change to exp_5, 1/sqrt(t) component.	2021-09-09 14:08:19 +08:00
Fangjun Kuang	7f8e3a673a	Add commands for reproducing. (#40) * Add commands for reproducing. * Use --bucketing-sampler by default.	2021-09-09 13:50:31 +08:00
Fangjun Kuang	abadc71415	Use new APIs with k2.RaggedTensor (#38) * Use new APIs with k2.RaggedTensor * Fix style issues. * Update the installation doc, saying it requires at least k2 v1.7 * Use k2 v1.7	2021-09-08 14:55:30 +08:00
Daniel Povey	56a88badd1	Move to Gloam optimizer, exponential lrate	2021-09-08 13:59:50 +08:00
Daniel Povey	d313c27c14	Change configuration again.. not great performance.	2021-09-07 20:58:00 +08:00
Daniel Povey	573e0582d8	Run in exp_2, with foam from start, knee_factor=5.0, initial_lrate=2e-04.	2021-08-30 14:10:21 +08:00
Daniel Povey	ccf7bdec23	Add Foam optimizer; I used this from epoch 3.	2021-08-28 21:51:54 +08:00
Daniel Povey	d045831a4f	Get dataset to work for empty input sentences; test it	2021-08-25 15:54:36 +08:00
Fangjun Kuang	184dbb3ea5	Add documentation about code style and creating new recipes. (#27)	2021-08-25 14:48:41 +08:00
Daniel Povey	a7b61100de	Use collate_fn as class. harmless but not necessary without multiple workers	2021-08-25 11:27:47 +08:00
Daniel Povey	0d97e689be	Version I am running...	2021-08-24 21:59:41 +08:00
Fangjun Kuang	96e7f5c7ea	Release v0.1 (#26)	2021-08-24 21:30:30 +08:00
pkufool	f4223ee110	Add TDNN-LSTM-CTC Results (#25) * Add tdnn-lstm pretrained model and results * Add docs for TDNN-LSTM-CTC * Minor fix * Fix typo * Fix style checking	2021-08-24 21:09:27 +08:00
Fangjun Kuang	1bd5dcc8ac	WIP: Add doc for the LibriSpeech recipe. (#24) * WIP: Add doc for the LibriSpeech recipe. * Add more doc for LibriSpeech recipe. * Add more doc for the LibriSpeech recipe. * More doc.	2021-08-24 20:28:32 +08:00
Daniel Povey	e6eefeba88	Changes to dataset to prevent OOM on batches with short sentences	2021-08-24 14:50:49 +08:00
Daniel Povey	9576d6574f	Various bug fixes	2021-08-23 23:45:03 +08:00
Daniel Povey	7711fba867	Fix bugs; first version that is running successfully.	2021-08-23 22:40:23 +08:00
Daniel Povey	c3a8727446	Add train.py	2021-08-23 22:28:45 +08:00
Daniel Povey	894be068e7	Update prepare.sh to create LM training data; add missed scripts local/prepare_lm_training_data.py	2021-08-23 19:51:58 +08:00
Daniel Povey	13200d707b	Merge remote-tracking branch 'upstream/master'	2021-08-23 19:13:15 +08:00
Daniel Povey	26b5b5ba46	Get tests to work for MaskedLmConformer	2021-08-23 19:05:31 +08:00
Daniel Povey	5fecd24664	Test, and fix, TransformerDecoderRelPos	2021-08-23 17:48:00 +08:00
Daniel Povey	7856ab89fc	Test, and fix, TransformerDecoderLayerRelPos	2021-08-23 17:39:37 +08:00
Daniel Povey	556fae586f	Add testing for MaskedLmConformerEncoder	2021-08-23 17:22:03 +08:00
Daniel Povey	2fbe3b78fd	Add more testing; fix issue about channel dim of LayerNorm.	2021-08-23 17:18:00 +08:00
Daniel Povey	e0b04ba54f	Progress in testing	2021-08-23 15:38:37 +08:00
Fangjun Kuang	6c2c9b9d74	Add recipe for the yes_no dataset. (#16) * Add recipe for the yes_no dataset. * Refactoring: Remove unused code. * Add Colab notebook for the yesno dataset. * Add GitHub actions to run yesno. * Fix a typo. * Minor fixes. * Train more epochs for GitHub actions. * Minor fixes. * Minor fixes. * Fix style issues.	2021-08-23 11:36:29 +08:00
Daniel Povey	03ff4aab2f	Some progress on refactoring conformer code, it's in transformer.py only...	2021-08-23 11:11:09 +08:00
pkufool	19c4214958	Fix code style and add copyright. (#18) * Fix style and add copyright * Minor fix * Remove duplicate lines * Reformat conformer.py by black * Reformat code style with black. * Fix github workflows * Fix lhotse installation * Install icefall requirements * Update k2 version, remove lhotse from test workflow	2021-08-23 10:43:59 +08:00
Daniel Povey	24d3a98378	Merge remote-tracking branch 'upstream/master'	2021-08-22 11:56:45 +08:00
Daniel Povey	ea43b49ef2	Remove BatchNorm, use LayerNorm	2021-08-22 11:56:22 +08:00

82 Commits