# GigaSpeech
GigaSpeech is an evolving, multi-domain English speech recognition corpus with 10,000 hours of high-quality labeled audio collected from audiobooks, podcasts, and YouTube. It covers both read and spontaneous speaking styles and a variety of topics, such as arts, science, and sports. More details can be found at https://github.com/SpeechColab/GigaSpeech

## Download

Apply for the download credentials and download the dataset by following the instructions at https://github.com/SpeechColab/GigaSpeech#download. Then create a symlink to the downloaded dataset:

```bash
ln -sfv /path/to/GigaSpeech download/GigaSpeech
```
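
Before starting data preparation, it can help to confirm that the symlink resolves to a real directory. The following is a minimal sketch, not part of the recipe: the function name `check_gigaspeech_link` is hypothetical, and the default path simply mirrors the `ln` command above.

```bash
# Hypothetical helper (not part of icefall): succeeds only if the given path
# is a symlink that resolves to an existing directory.
check_gigaspeech_link() {
  # Default path is an assumption taken from the `ln` command above.
  local target="${1:-download/GigaSpeech}"
  if [ -L "$target" ] && [ -d "$target" ]; then
    echo "OK: $target -> $(readlink "$target")"
  else
    echo "Missing or broken symlink: $target" >&2
    return 1
  fi
}
```

Call `check_gigaspeech_link` from the same directory where the `ln` command was run, or pass a different path as the first argument.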

## Performance Record

|         | Dev   | Test  |
|---------|-------|-------|
| WER (%) | 10.47 | 10.58 |

See [RESULTS](/egs/gigaspeech/ASR/RESULTS.md) for details.