icefall

Author	SHA1	Message	Date
Daniel Povey	7711fba867	Fix bugs; first version that is running successfully.	2021-08-23 22:40:23 +08:00
Daniel Povey	c3a8727446	Add train.py	2021-08-23 22:28:45 +08:00
Daniel Povey	894be068e7	Update prepare.sh to create LM training data; add missed scripts local/prepare_lm_training_data.py	2021-08-23 19:51:58 +08:00
Daniel Povey	13200d707b	Merge remote-tracking branch 'upstream/master'	2021-08-23 19:13:15 +08:00
Daniel Povey	26b5b5ba46	Get tests to work for MaskedLmConformer	2021-08-23 19:05:31 +08:00
Daniel Povey	5fecd24664	Test, and fix, TransformerDecoderRelPos	2021-08-23 17:48:00 +08:00
Daniel Povey	7856ab89fc	Test, and fix, TransformerDecoderLayerRelPos	2021-08-23 17:39:37 +08:00
Daniel Povey	556fae586f	Add testing for MaskedLmConformerEncoder	2021-08-23 17:22:03 +08:00
Daniel Povey	2fbe3b78fd	Add more testing; fix issue about channel dim of LayerNorm.	2021-08-23 17:18:00 +08:00
Fangjun Kuang	57cb611665	[yesno] Remove padding in TDNN (#21) * Disable SpecAug for yesno. Also replace Adam with SGD. * Remove padding in the model to make the results reproducible.	2021-08-23 15:59:36 +08:00
Daniel Povey	e0b04ba54f	Progress in testing	2021-08-23 15:38:37 +08:00
Fangjun Kuang	6c2c9b9d74	Add recipe for the yes_no dataset. (#16) * Add recipe for the yes_no dataset. * Refactoring: Remove unused code. * Add Colab notebook for the yesno dataset. * Add GitHub actions to run yesno. * Fix a typo. * Minor fixes. * Train more epochs for GitHub actions. * Minor fixes. * Minor fixes. * Fix style issues.	2021-08-23 11:36:29 +08:00
Daniel Povey	03ff4aab2f	Some progress on refactoring conformer code, it's in transformer.py only...	2021-08-23 11:11:09 +08:00
pkufool	19c4214958	Fix code style and add copyright. (#18) * Fix style and add copyright * Minor fix * Remove duplicate lines * Reformat conformer.py by black * Reformat code style with black. * Fix github workflows * Fix lhotse installation * Install icefall requirements * Update k2 version, remove lhotse from test workflow	2021-08-23 10:43:59 +08:00
Daniel Povey	24d3a98378	Merge remote-tracking branch 'upstream/master'	2021-08-22 11:56:45 +08:00
Daniel Povey	ea43b49ef2	Remove BatchNorm, use LayerNorm	2021-08-22 11:56:22 +08:00
Daniel Povey	076a70b62d	Initial conformer refactoring, not nearly done	2021-08-22 11:47:26 +08:00
Daniel Povey	cbe5ee1111	Copy some files, will edit..	2021-08-21 22:35:43 +08:00
Daniel Povey	421a41027a	Get dataset.py working..	2021-08-21 18:23:46 +08:00
Fangjun Kuang	8469f9ae0a	Refactor asr_datamodule. (#15) * WIP: Refactor asr_datamodule. * Fixes after review. * Minor fixes.	2021-08-21 09:53:46 +08:00
Fangjun Kuang	0b656e4e1c	Add a link to Colab. (#14) It demonstrates the usages of pre-trained models.	2021-08-20 15:43:25 +08:00
Fangjun Kuang	9d0cc9d829	Support computing nbest oracle WER. (#10) * Support computing nbest oracle WER. * Add scale to all nbest based decoding/rescoring methods. * Add script to run pretrained models. * Use torchaudio to extract features. * Support decoding multiple files at the same time. Also, use kaldifeat for feature extraction. * Support decoding with LM rescoring and attention-decoder rescoring. * Minor fixes. * Replace scale with lattice-score-scale. * Add usage example with a provided pretrained model.	2021-08-20 11:53:37 +08:00
pkufool	ef233486ae	The training script produce WER of 2.57% on librispeech test-clean (#13) * Add grad_clip and weight-decay, small fix of dataloader and masking * Add RESULTS.md	2021-08-20 10:08:08 +08:00
Fangjun Kuang	caa0b9e942	Fix an error in displaying decoding process. (#12)	2021-08-19 14:54:01 +08:00
Fangjun Kuang	1c3b13c7eb	Minor fixes. (#9)	2021-08-16 19:01:25 +08:00
Fangjun Kuang	12a2fd023e	Add doc about installation and usage (#7) * Add readme. * Add TOC. * fix typos * Minor fixes after review.	2021-08-12 12:44:04 +08:00
Fangjun Kuang	5a0b9bcb23	Refactoring (#4) * Fix an error in TDNN-LSTM training. * WIP: Refactoring * Refactor transformer.py * Remove unused code. * Minor fixes.	2021-08-04 14:53:02 +08:00
Daniel Povey	cf8d76293d	Merge pull request #3 from csukuangfj/style-check Add CTC training	2021-07-31 15:36:00 +08:00
Fangjun Kuang	398ed80d7a	Minor fixes to support DDP training.	2021-07-31 15:26:57 +08:00
Fangjun Kuang	b94d97da37	Disable gradient computation in evaluation mode.	2021-07-29 20:37:31 +08:00
Fangjun Kuang	acc63a9172	WIP: Add BPE training code.	2021-07-29 20:23:52 +08:00
Fangjun Kuang	bd69e4be32	Use attention decoder for rescoring.	2021-07-28 12:22:09 +08:00
Fangjun Kuang	f65854cca5	Add BPE decoding results.	2021-07-27 17:38:47 +08:00
Fangjun Kuang	4ccae509d3	WIP: Begin to add BPE decoding	2021-07-26 20:06:58 +08:00
Fangjun Kuang	d3101fb005	Fix loading checkpoint in DDP training.	2021-07-26 08:08:14 +08:00
Fangjun Kuang	78bb65ed78	Fix an error in DDP training.	2021-07-25 22:33:09 +08:00
Fangjun Kuang	8055bf31a0	Support DDP training.	2021-07-25 21:40:09 +08:00
Fangjun Kuang	4a66712406	Add LM rescoring.	2021-07-25 18:21:26 +08:00
Fangjun Kuang	6f9fe5b906	Refactor decoding code.	2021-07-24 22:23:50 +08:00
Fangjun Kuang	00f8371f37	begin to add LM rescoring.	2021-07-24 18:24:04 +08:00
Fangjun Kuang	a9095925ba	Fix CI test errors.	2021-07-24 18:13:03 +08:00
Fangjun Kuang	54436182a4	Fix CI.	2021-07-24 18:05:19 +08:00
Fangjun Kuang	ee83a3e67c	Fix CI dependencies installation.	2021-07-24 17:55:45 +08:00
Fangjun Kuang	2e33e24348	Add CI test.	2021-07-24 17:47:41 +08:00
Fangjun Kuang	f3542c7793	Add CTC training.	2021-07-24 17:13:20 +08:00
Fangjun Kuang	a01d08f73c	Add self-loops to propagate disambiguation symbols.	2021-07-21 13:12:20 +08:00
Fangjun Kuang	8a72901f3a	Minor fixes.	2021-07-20 19:54:12 +08:00
Fangjun Kuang	d5e0408698	Add prepare_lang.py based on prepare_lang.sh	2021-07-20 19:41:21 +08:00
Fangjun Kuang	e005ea062c	Minor fixes after review.	2021-07-20 10:02:20 +08:00
Fangjun Kuang	f25eedf2d4	Fixes after review.	2021-07-20 00:14:24 +08:00

55 Commits