Added README

Guanbo Wang 2022-04-06 20:30:09 -04:00
parent a4e1471d1d
commit f857d5a9ea
2 changed files with 19 additions and 1 deletions

@@ -0,0 +1,18 @@
# GigaSpeech
GigaSpeech is an evolving, multi-domain English
speech recognition corpus with 10,000 hours of high-quality labeled
audio collected from audiobooks, podcasts,
and YouTube. It covers both read and spontaneous speaking styles
and a variety of topics, such as arts, science, and sports. More details can be found at https://github.com/SpeechColab/GigaSpeech
## Download
Apply for the download credentials and download the dataset by following the instructions at https://github.com/SpeechColab/GigaSpeech#download. Then create a symlink:
```bash
ln -sfv /path/to/GigaSpeech download/GigaSpeech
```
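Before running the recipe, it may help to confirm that the link resolves to a real directory. The following is a minimal sketch, assuming the `download/GigaSpeech` path from the command above; the `check_link` helper is hypothetical and not part of the recipe.
```bash
# Hypothetical helper, not part of the recipe: succeeds only if the given
# path is a symlink whose target is an existing directory.
check_link() {
  local link="$1"
  if [ ! -L "$link" ]; then
    echo "$link is not a symlink" >&2
    return 1
  fi
  # -d follows the link, so this fails when the target is missing.
  if [ ! -d "$link" ]; then
    echo "$link points to a missing directory" >&2
    return 1
  fi
  echo "$link -> $(readlink "$link")"
}

# Usage: check_link download/GigaSpeech
```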
## Performance Record
| |Dev|Test|
|---|---|---|
|WER |11.92|11.85|

@@ -81,7 +81,7 @@ if [ $stage -le 0 ] && [ $stop_stage -ge 0 ]; then
# Check credentials.
if [ ! -f $dl_dir/password ]; then
echo -n "$0: Please apply for the download credentials by following"
-echo -n "https://github.com/SpeechColab/GigaSpeech#dataset-download"
+echo -n "https://github.com/SpeechColab/GigaSpeech#download"
echo " and save it to $dl_dir/password."
exit 1;
fi
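
For reference, the credentials check in the hunk above can be sketched as a standalone function. This is only an illustration under stated assumptions: the `check_credentials` name, the use of `return` instead of `exit`, and printing to stderr are not taken from the script; the message text and the `$dl_dir/password` path are.

```bash
# Hypothetical helper mirroring the check in the hunk above.
# Assumptions: function name, `return` instead of `exit`, output on stderr.
check_credentials() {
  local dl_dir="$1"
  if [ ! -f "$dl_dir/password" ]; then
    # echo joins its arguments with single spaces, so the URL stays
    # separated from the surrounding words.
    echo "$0: Please apply for the download credentials by following" \
         "https://github.com/SpeechColab/GigaSpeech#download" \
         "and save it to $dl_dir/password." >&2
    return 1
  fi
}

# Usage: check_credentials "$dl_dir" || exit 1
```

Passing the message parts as separate `echo` arguments inserts a space between "following" and the URL, which the chained `echo -n` calls do not.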