9 Commits

Author SHA1 Message Date
Fangjun Kuang
dbda1644b5
Replace load_manifest_lazy with load_manifest for MUSAN. (#412) 2022-06-09 11:42:18 +08:00
Fangjun Kuang
f1abce72f8
Use jsonl for CutSet in the LibriSpeech recipe. (#397)
* Use jsonl for cutsets in the librispeech recipe.

* Use lazy cutset for all recipes.

* More fixes to use lazy CutSet.

* Remove force=True from logging to support Python < 3.8

* Minor fixes.

* Fix style issues.
2022-06-06 10:19:16 +08:00
Ewald Enzinger
8c5722de8c
[egs] Add prefix when reading manifests due to recent lhotse changes (#382)
* [egs] Add prefix when reading manifests due to recent lhotse changes

* Fix wenetspeech

* Fix style issues
2022-05-23 23:37:35 +08:00
Daniel Povey
4e23fb2252
Improve diagnostics code memory-wise and accumulate more stats. (#373)
* Update diagnostics, hopefully print more stats.

# Conflicts:
#	egs/librispeech/ASR/pruned_transducer_stateless4b/train.py

* Remove memory-limit options arg

* Remove unnecessary option for diagnostics code, collect on more batches
2022-05-19 11:45:59 +08:00
Guanbo Wang
9630f9a3ba
Update GigaSpeech reults (#364)
* Update decode.py

* Update export.py

* Update results

* Update README.md
2022-05-15 12:57:40 +08:00
Fangjun Kuang
e30e042c39
Update decoding script for gigaspeech and remove duplicate files. (#361) 2022-05-13 13:03:16 +08:00
Guanbo Wang
48a6a9a549
GigaSpeech RNN-T experiments (#318)
* Copy RNN-T recipe from librispeech

* flake8

* flake8

* Update params

* gigaspeech decode

* black

* Update results

* syntax highlight

* Update RESULTS.md

* typo
2022-05-13 11:03:26 +08:00
Guanbo Wang
8e3c89076e Bug fix (#352) 2022-05-07 08:10:54 +08:00
Wang, Guanbo
5fe58de43c
GigaSpeech recipe (#120)
* initial commit

* support download, data prep, and fbank

* on-the-fly feature extraction by default

* support BPE based lang

* support HLG for BPE

* small fix

* small fix

* chunked feature extraction by default

* Compute features for GigaSpeech by splitting the manifest.

* Fixes after review.

* Split manifests into 2000 pieces.

* set audio duration mismatch tolerance to 0.01

* small fix

* add conformer training recipe

* Add conformer.py without pre-commit checking

* lazy loading and use SingleCutSampler

* DynamicBucketingSampler

* use KaldifeatFbank to compute fbank for musan

* use pretrained language model and lexicon

* use 3gram to decode, 4gram to rescore

* Add decode.py

* Update .flake8

* Delete compute_fbank_gigaspeech.py

* Use BucketingSampler for valid and test dataloader

* Update params in train.py

* Use bpe_500

* update params in decode.py

* Decrease num_paths while CUDA OOM

* Added README

* Update RESULTS

* black

* Decrease num_paths while CUDA OOM

* Decode with post-processing

* Update results

* Remove lazy_load option

* Use default `storage_type`

* Keep the original tolerance

* Use split-lazy

* black

* Update pretrained model

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2022-04-14 16:07:22 +08:00