icefall

Author	SHA1	Message	Date
Fangjun Kuang	ac84220de9	Modified conformer with multi datasets (#312) * Copy files for editing. * Use librispeech + gigaspeech with modified conformer. * Support specifying number of workers for on-the-fly feature extraction. * Feature extraction code for GigaSpeech. * Combine XL splits lazily during training. * Fix warnings in decoding. * Add decoding code for GigaSpeech. * Fix decoding the gigaspeech dataset. We have to use the decoder/joiner networks for the GigaSpeech dataset. * Disable speed perturbation for XL subset. * Compute the Nbest oracle WER for RNN-T decoding. * Minor fixes. * Minor fixes. * Add results. * Update results. * Update CI. * Update results. * Fix style issues. * Update results. * Fix style issues.	2022-04-29 15:40:30 +08:00
Fangjun Kuang	caab6cfd92	Support specifying iteration number of checkpoints for decoding. (#336) See also #289	2022-04-28 14:09:22 +08:00
Fangjun Kuang	9aeea3e1af	Support averaging models with weight tying. (#333)	2022-04-26 13:32:03 +08:00
pehonnet	9a98e6ced6	fix fp16 option in example usage (#332)	2022-04-25 18:51:53 +08:00
whsqkaak	d766dc5aee	Fix some typos. (#329)	2022-04-22 15:54:59 +08:00
Fangjun Kuang	3607c516d6	Update results for torchaudio RNN-T. (#322)	2022-04-20 11:15:10 +08:00
Fangjun Kuang	fce7f3cd9a	Support computing RNN-T loss with torchaudio (#316)	2022-04-19 18:47:13 +08:00
Wei Kang	021c79824e	Add LG decoding (#277) * Add LG decoding * Add log weight pushing * Minor fixes	2022-04-19 17:23:46 +08:00
Wang, Guanbo	5fe58de43c	GigaSpeech recipe (#120 ) * initial commit * support download, data prep, and fbank * on-the-fly feature extraction by default * support BPE based lang * support HLG for BPE * small fix * small fix * chunked feature extraction by default * Compute features for GigaSpeech by splitting the manifest. * Fixes after review. * Split manifests into 2000 pieces. * set audio duration mismatch tolerance to 0.01 * small fix * add conformer training recipe * Add conformer.py without pre-commit checking * lazy loading and use SingleCutSampler * DynamicBucketingSampler * use KaldifeatFbank to compute fbank for musan * use pretrained language model and lexicon * use 3gram to decode, 4gram to rescore * Add decode.py * Update .flake8 * Delete compute_fbank_gigaspeech.py * Use BucketingSampler for valid and test dataloader * Update params in train.py * Use bpe_500 * update params in decode.py * Decrease num_paths while CUDA OOM * Added README * Update RESULTS * black * Decrease num_paths while CUDA OOM * Decode with post-processing * Update results * Remove lazy_load option * Use default `storage_type` * Keep the original tolerance * Use split-lazy * black * Update pretrained model Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>	2022-04-14 16:07:22 +08:00
Mingshuang Luo	d88e786513	Changes for pretrained.py (tedlium3 pruned RNN-T) (#311)	2022-04-14 09:54:07 +08:00
Daniel Povey	62fbfb52d0	Merge pull request #315 from danpovey/mixprec_md300 Add results for mixed precision with max-duration 300	2022-04-13 20:23:07 +08:00
Daniel Povey	af6ae840ee	Add results for mixed precision with max-duration 300	2022-04-13 20:22:11 +08:00
Daniel Povey	c0003483d3	Merge pull request #313 from glynpu/fix_comments fix comments	2022-04-13 14:03:02 +08:00
Guo Liyong	78418ac37c	fix comments	2022-04-13 13:09:24 +08:00
Daniel Povey	2a854f5607	Merge pull request #309 from danpovey/update_results Update results; will further update this before merge	2022-04-12 12:22:48 +08:00
Daniel Povey	9ed7a169e1	Add one more epoch of full expt	2022-04-12 12:20:10 +08:00
Daniel Povey	d0a53aad48	Fix tensorboard log location	2022-04-12 11:51:15 +08:00
Daniel Povey	65818d16de	Add more results	2022-04-12 11:48:16 +08:00
Fangjun Kuang	bdeff338c2	Fix CI errors. (#310)	2022-04-12 09:09:56 +08:00
Mingshuang Luo	118e195004	Update results for tedlium3 pruned RNN-T (#307) * Update README.md	2022-04-11 22:19:26 +08:00
Mingshuang Luo	93c60a9d30	Code style check for librispeech pruned transducer stateless2 (#308)	2022-04-11 22:15:18 +08:00
Daniel Povey	ead822477c	Fix rebase	2022-04-11 21:01:13 +08:00
Daniel Povey	e8eb0b94d9	Updating RESULTS.md; fix in beam_search.py	2022-04-11 21:00:11 +08:00
pkufool	a92133ef96	Minor fixes	2022-04-11 20:58:47 +08:00
pkufool	ddd8f9e15e	Minor fixes	2022-04-11 20:58:43 +08:00
pkufool	cc0d4ffa4f	Add mix precision support	2022-04-11 20:58:02 +08:00
Mingshuang Luo	8cb727e24a	Tedlium3 pruned transducer stateless (#261) * update tedlium3-pruned-transducer-stateless-codes * update README.md * update README.md * add fast beam search for decoding * do a change for RESULTS.md * do a change for RESULTS.md * do a fix * do some changes for pruned RNN-T	2022-04-11 17:08:53 +08:00
Wei Kang	7012fd65b5	Support mix precision training on the reworked model (#305) * Add mix precision support * Minor fixes * Minor fixes * Minor fixes	2022-04-11 16:49:54 +08:00
Daniel Povey	34aad74a2c	Merge pull request #303 from danpovey/fix_docs Fix docs in optim.py	2022-04-11 15:14:06 +08:00
Daniel Povey	03c7c2613d	Fix docs in optim.py	2022-04-11 15:13:42 +08:00
Daniel Povey	6eb6d9b4cd	Merge pull request #288 from danpovey/reworked_model Reworked model	2022-04-11 15:03:08 +08:00
Daniel Povey	5078332088	Fix adding learning rate to tensorboard	2022-04-11 14:58:15 +08:00
Daniel Povey	d5f9d49e53	Modify beam search to be efficient with current joiner	2022-04-11 12:35:29 +08:00
Daniel Povey	46d52dda10	Fix dir names	2022-04-11 12:03:41 +08:00
Wei Kang	f721a2fd7a	Minor fixes for logging (#296) * Minor fixes for logging * Minor fix	2022-04-10 23:34:18 +08:00
Zengwei Yao	08473a17aa	Modify init (#301) * update icefall/__init__.py to import more common functions. * update icefall/__init__.py * make imports style consistent. * exclude black check for icefall/__init__.py in pyproject.toml.	2022-04-10 23:29:28 +08:00
Daniel Povey	962cf868c9	Fix import	2022-04-10 15:31:46 +08:00
Daniel Povey	d1e4ae788d	Refactor how learning rate is set.	2022-04-10 15:25:27 +08:00
Daniel Povey	82d58629ea	Implement 2p version of learning rate schedule.	2022-04-10 13:50:31 +08:00
Daniel Povey	da50525ca5	Make lrate rule more symmetric	2022-04-10 13:25:40 +08:00
Daniel Povey	4d41ee0caa	Implement 2o schedule	2022-04-09 18:37:03 +08:00
Daniel Povey	db72aee1f0	Set 2n rule..	2022-04-09 18:15:56 +08:00
Daniel Povey	0f8ee68af2	Fix bug	2022-04-08 16:53:42 +08:00
Daniel Povey	f587cd527d	Change exponential part of lrate to be epoch based	2022-04-08 16:24:21 +08:00
Daniel Povey	6ee32cf7af	Set new scheduler	2022-04-08 16:10:06 +08:00
Fangjun Kuang	78b8792d1d	Fix potential bugs in PyTorch that exist in label_smoothing. (#300)	2022-04-08 13:41:33 +08:00
Fangjun Kuang	7c0070e6f6	Display torch version in the training log. (#299)	2022-04-08 11:39:54 +08:00
Daniel Povey	61486a0f76	Remove initial_speed	2022-04-06 13:17:26 +08:00
Daniel Povey	a41e93437c	Change some defaults in LR-setting rule.	2022-04-06 12:36:58 +08:00
Zengwei Yao	ceeb95bcb8	update icefall/__init__.py to import more common functions. (#294)	2022-04-06 11:55:29 +08:00

422 Commits