icefall

Author	SHA1	Message	Date
yaozengwei	2a5a70e03e	Merge remote-tracking branch 'k2-fsa/master'	2022-06-13 12:52:28 +08:00
Fangjun Kuang	9f6c748b30	Add links to sherpa. (#417 ) * Add links to sherpa.	2022-06-10 12:19:18 +08:00
Fangjun Kuang	bfeab319c9	Fix aishell. (#416 )	2022-06-10 11:47:43 +08:00
Fangjun Kuang	dbda1644b5	Replace load_manifest_lazy with load_manifest for MUSAN. (#412 )	2022-06-09 11:42:18 +08:00
Fangjun Kuang	ed66877694	Replace ChunkedLilcomHdf5Writer with LilcomChunkyWriter. (#411 )	2022-06-09 11:18:52 +08:00
Quandwang	8512aaf585	fix typos (#409 )	2022-06-08 20:08:44 +08:00
Mingshuang Luo	5079d99ee2	a correction for text2segmentation.py (#407 )	2022-06-08 12:06:57 +08:00
Fangjun Kuang	1094a3cb37	Replace LilcomChunkyWriter with ChunkedLilcomHdf5Writer. (#404 )	2022-06-07 18:14:25 +08:00
Fangjun Kuang	80c46f0abd	Fix exporting emformer with torchscript using torch 1.6.0 (#402 )	2022-06-07 09:19:37 +08:00
Fangjun Kuang	29fa878fff	Fix Emformer for torchscript using torch 1.6.0 (#401 )	2022-06-06 17:08:07 +08:00
Mingshuang Luo	0a21eaae7f	do a change for decode.py (#400 )	2022-06-06 15:44:04 +08:00
Fangjun Kuang	f1abce72f8	Use jsonl for CutSet in the LibriSpeech recipe. (#397 ) * Use jsonl for cutsets in the librispeech recipe. * Use lazy cutset for all recipes. * More fixes to use lazy CutSet. * Remove force=True from logging to support Python < 3.8 * Minor fixes. * Fix style issues.	2022-06-06 10:19:16 +08:00
Mingshuang Luo	e5884f82e0	[Ready to merge] Add prefix for compute fbank (#398 ) * add prefix * add prefix	2022-06-05 18:17:52 +08:00
fanlu	8a3068ead8	Update decode.py (#392 ) * Update decode.py fix bug ```TypeError: greedy_search_batch() missing 1 required positional argument: 'encoder_out_lens'``` * fix modified_beam_search Co-authored-by: fanlu3 <fanlu@jd.com>	2022-06-04 19:08:17 +08:00
Zengwei Yao	148f69d8d9	Update RESULTS.md (#388 ) * update RESULT.md about pruned_transducer_stateless4 * Update RESULT.md This PR is only to update RESULT.md about pruned_transducer_stateless4. * set default value of --use-averaged-model to True * update RESULTS.md and add decode command * minor fix * update export.py * add uploaded files links * update link * fix typos	2022-06-04 15:52:35 +08:00
Mingshuang Luo	beab229fd7	[Ready to merge] Pruned_transducer_stateless2 for alimeeting dataset (#378 ) * add pruned-rnnt2 recipe for alimeeting dataset * update code for merging * change LilcomHdf5Writer to ChunkedLilcomHdf5Writer * change for test.yml * change for test.yml * change for test.yml * change for workflow yml * change for yml * change for yml * change for README.md * change for yml * solve the conflicts * solve the conflicts	2022-06-04 13:47:46 +08:00
Fangjun Kuang	fbfc98f1d3	Add streaming Emformer stateless RNN-T. (#390 ) * Add streaming Emformer stateless RNN-T. * Update results for streaming Emformer. * Minor fixes.	2022-06-01 14:31:47 +08:00
yaozengwei	bb7ea3141b	Merge remote-tracking branch 'k2-fsa/master'	2022-05-31 13:34:23 +08:00
LIyong.Guo	c4ee2bc0af	[Ready to merge] stateless6: stateless4 + hubert distillation. (#387 ) * a copy of stateless4 as base * distillation with hubert * fix typo * example usage * usage * Update egs/librispeech/ASR/pruned_transducer_stateless6/hubert_xlarge.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * fix comment * add results of 100hours * Update egs/librispeech/ASR/pruned_transducer_stateless6/hubert_xlarge.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * Update egs/librispeech/ASR/pruned_transducer_stateless6/hubert_xlarge.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * check fairseq and quantization * a short intro to distillation framework * Update egs/librispeech/ASR/pruned_transducer_stateless6/hubert_xlarge.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * add intro of stateless6 in README * fix type error of dst_manifest_dir * Update egs/librispeech/ASR/pruned_transducer_stateless6/hubert_xlarge.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * make export.py call stateless6/train.py instead of stateless2/train.py * update results by stateless6 * adjust results format * fix typo Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>	2022-05-28 12:37:50 +08:00
yaozengwei	545316636b	Merge remote-tracking branch 'origin/master'	2022-05-26 21:55:56 +08:00
yaozengwei	fbbc24f941	Merge remote-tracking branch 'k2-fsa/master'	2022-05-26 21:54:40 +08:00
Mingshuang Luo	c8c8645081	[Ready to merge] Pruned-transducer-stateless2 recipe for aidatatang_200zh (#375 ) * add pruned-rnnt2 model for aidatatang_200zh * do some changes * change for README.md * do some changes	2022-05-24 23:07:40 +08:00
Ewald Enzinger	8c5722de8c	[egs] Add prefix when reading manifests due to recent lhotse changes (#382 ) * [egs] Add prefix when reading manifests due to recent lhotse changes * Fix wenetspeech * Fix style issues	2022-05-23 23:37:35 +08:00
Mingshuang Luo	0e57b30495	[Ready to merge] Pruned Transducer Stateless2 for WenetSpeech (char-based) (#349 ) * add char-based pruned-rnnt2 for wenetspeech * style check * style check * change for export.py * do some changes * do some changes * a small change for .flake8 * solve the conflicts	2022-05-23 17:13:01 +08:00
Fangjun Kuang	2f1e23cde1	Narrower and deeper conformer (#330 ) * Copy files for editing. * Add random combine from #229. * Minor fixes. * Pass model parameters from the command line. * Fix warnings. * Fix warnings. * Update readme. * Rename to avoid conflicts. * Update results. * Add CI for pruned_transducer_stateless5 * Typo fixes. * Remove random combiner. * Update decode.py and train.py to use periodically averaged models. * Minor fixes. * Revert to use random combiner. * Update results. * Minor fixes.	2022-05-23 14:39:11 +08:00
Mingshuang Luo	ec5a112831	[Ready to merge] Do some coding style checks for the latest files (#379 ) * style check * do changes for .flake8 * a change for compute_fbank_yesno.py	2022-05-20 19:30:38 +08:00
Daniel Povey	2900ed8f8f	Merge pull request #376 from danpovey/diagnostics_fix Diagnostics fix	2022-05-19 12:51:07 +08:00
Daniel Povey	9e88d0bf31	Merge remote-tracking branch 'upstream/master'	2022-05-19 12:49:12 +08:00
Daniel Povey	5230e73e41	Small fixes	2022-05-19 12:49:00 +08:00
Daniel Povey	4e23fb2252	Improve diagnostics code memory-wise and accumulate more stats. (#373 ) * Update diagnostics, hopefully print more stats. # Conflicts: # egs/librispeech/ASR/pruned_transducer_stateless4b/train.py * Remove memory-limit options arg * Remove unnecessary option for diagnostics code, collect on more batches	2022-05-19 11:45:59 +08:00
Daniel Povey	c736b39c7d	Remove unnecessary option for diagnostics code, collect on more batches	2022-05-19 11:35:54 +08:00
Daniel Povey	c0fdfabaf3	Remove memory-limit options arg	2022-05-19 11:30:56 +08:00
Daniel Povey	c2c46ea023	Update diagnostics, hopefully print more stats. # Conflicts: # egs/librispeech/ASR/pruned_transducer_stateless4b/train.py	2022-05-19 11:29:31 +08:00
Fangjun Kuang	f6ce135608	Various fixes to support torch script. (#371 ) * Various fixes to support torch script. * Add tests to ensure that the model is torch scriptable. * Update tests.	2022-05-16 21:46:59 +08:00
Desh Raj	5aafbb970e	SPGISpeech recipe (#334 ) * initial commit for SPGISpeech recipe * add decoding * add spgispeech transducer * remove conformer ctc; minor fixes in RNN-T * add results * add tensorboard * add pretrained model to HF * remove unused scripts and soft link common scripts * remove duplicate files * pre commit hooks * remove change in librispeech * pre commit hook * add CER numbers	2022-05-16 20:52:14 +08:00
yaozengwei	c9d84aeb5c	Merge remote-tracking branch 'k2-fsa/master'	2022-05-15 18:02:27 +08:00
Fangjun Kuang	6f7860a0a6	Fix GitHub CI for decoding GigaSpeech dev/test datasets (#366 )	2022-05-15 14:25:35 +08:00
Guanbo Wang	9630f9a3ba	Update GigaSpeech results (#364 ) * Update decode.py * Update export.py * Update results * Update README.md	2022-05-15 12:57:40 +08:00
Fangjun Kuang	f23dd43719	Update results for libri+giga multi dataset setup. (#363 ) * Update results for libri+giga multi dataset setup.	2022-05-14 21:45:39 +08:00
Fangjun Kuang	2d7096dfc6	Decode gigaspeech in GitHub actions (#362 ) * Add CI for gigaspeech.	2022-05-14 08:53:22 +08:00
Fangjun Kuang	0f180b3ce2	Validate that there are no OOV tokens in BPE-based lexicons. (#359 ) * Validate that there are no OOV tokens in BPE-based lexicons. * Typo fixes.	2022-05-13 14:00:35 +08:00
Fangjun Kuang	e30e042c39	Update decoding script for gigaspeech and remove duplicate files. (#361 )	2022-05-13 13:03:16 +08:00
Guanbo Wang	48a6a9a549	GigaSpeech RNN-T experiments (#318 ) * Copy RNN-T recipe from librispeech * flake8 * flake8 * Update params * gigaspeech decode * black * Update results * syntax highlight * Update RESULTS.md * typo	2022-05-13 11:03:26 +08:00
Fangjun Kuang	7b7acdf369	Support --iter in export.py (#360 )	2022-05-13 10:51:44 +08:00
Fangjun Kuang	aeb8986e35	Ignore padding frames during RNN-T decoding. (#358 ) * Ignore padding frames during RNN-T decoding. * Fix outdated decoding code. * Minor fixes.	2022-05-13 07:39:14 +08:00
yaozengwei	bcef517a84	Merge remote-tracking branch 'k2-fsa/master'	2022-05-12 17:45:45 +08:00
Fangjun Kuang	bc284e88e6	Run decode.py in GitHub actions. (#356 )	2022-05-10 14:51:34 +08:00
Fangjun Kuang	cd460f7bf1	Stringify torch.__version__ before serializing it. (#354 )	2022-05-07 17:18:34 +08:00
Zengwei Yao	20f092e709	Support decoding with averaged model when using --iter (#353 ) * support decoding with averaged model when using --iter * minor fix * minor fix of copyright date	2022-05-07 13:09:11 +08:00
Mingshuang Luo	f783e10dc8	Do some changes for aishell/ASR/transducer stateless/export.py (#347 ) * do some changes for aishell/ASR/transducer_stateless/export.py	2022-05-07 11:09:31 +08:00

484 Commits