icefall/egs/ljspeech/TTS/README.md
2024-02-05 12:41:19 +08:00

17 lines
772 B
Markdown

# Introduction
This is a public domain speech dataset consisting of 13,100 short audio clips of a single speaker reading passages from 7 non-fiction books.
A transcription is provided for each clip.
Clips vary in length from 1 to 10 seconds and have a total length of approximately 24 hours.
The texts were published between 1884 and 1964, and are in the public domain.
The audio was recorded in 2016-17 by the [LibriVox](https://librivox.org/) project and is also in the public domain.
The above information is from the [LJSpeech website](https://keithito.com/LJ-Speech-Dataset/).
# VITS
This recipe provides a VITS model trained on the LJSpeech dataset.
Pretrained model can be found [here](https://huggingface.co/Zengwei/icefall-tts-ljspeech-vits-2023-11-29).