diff --git a/_sources/recipes/TTS/index.rst.txt b/_sources/recipes/TTS/index.rst.txt
new file mode 100644
index 000000000..aa891c072
--- /dev/null
+++ b/_sources/recipes/TTS/index.rst.txt
@@ -0,0 +1,7 @@
+TTS
+======
+
+.. toctree::
+   :maxdepth: 2
+
+   ljspeech/vits
diff --git a/_sources/recipes/TTS/ljspeech/vits.rst.txt b/_sources/recipes/TTS/ljspeech/vits.rst.txt
new file mode 100644
index 000000000..385fd3c70
--- /dev/null
+++ b/_sources/recipes/TTS/ljspeech/vits.rst.txt
@@ -0,0 +1,113 @@
+VITS
+===============
+
+This tutorial shows you how to train a VITS model
+with the `LJSpeech `_ dataset.
+
+.. note::
+
+   The VITS paper: `Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech `_
+
+
+Data preparation
+----------------
+
+.. code-block:: bash
+
+   $ cd egs/ljspeech/TTS
+   $ ./prepare.sh
+
+To run only stages 1 through 5, use
+
+.. code-block:: bash
+
+   $ ./prepare.sh --stage 1 --stop_stage 5
+
+
+Build Monotonic Alignment Search
+--------------------------------
+
+.. code-block:: bash
+
+   $ cd vits/monotonic_align
+   $ python setup.py build_ext --inplace
+   $ cd ../../
+
+
+Training
+--------
+
+.. code-block:: bash
+
+   $ export CUDA_VISIBLE_DEVICES="0,1,2,3"
+   $ ./vits/train.py \
+     --world-size 4 \
+     --num-epochs 1000 \
+     --start-epoch 1 \
+     --use-fp16 1 \
+     --exp-dir vits/exp \
+     --tokens data/tokens.txt \
+     --max-duration 500
+
+.. note::
+
+   You can adjust the hyper-parameters to control the size of the VITS model and
+   the training configuration. For more details, please run ``./vits/train.py --help``.
+
+.. note::
+
+   Training can take a long time (usually a couple of days).
+
+Training logs, checkpoints, and TensorBoard logs are saved in ``vits/exp``.
+
+
+Inference
+---------
+
+The inference step uses checkpoints saved by the training step, so you have to run
+training first.
+The inference script saves the ground-truth and generated wavs to the directory
+``vits/exp/infer/epoch-*/wav``, e.g., ``vits/exp/infer/epoch-1000/wav``.
+
+.. code-block:: bash
+
+   $ export CUDA_VISIBLE_DEVICES="0"
+   $ ./vits/infer.py \
+     --epoch 1000 \
+     --exp-dir vits/exp \
+     --tokens data/tokens.txt \
+     --max-duration 500
+
+.. note::
+
+   For more details, please run ``./vits/infer.py --help``.
+
+
+Export models
+-------------
+
+Currently, only ONNX export is supported. The export generates two files in the given ``exp-dir``:
+``vits-epoch-*.onnx`` and ``vits-epoch-*.int8.onnx``.
+
+.. code-block:: bash
+
+   $ ./vits/export-onnx.py \
+     --epoch 1000 \
+     --exp-dir vits/exp \
+     --tokens data/tokens.txt
+
+You can test the exported ONNX model with:
+
+.. code-block:: bash
+
+   $ ./vits/test_onnx.py \
+     --model-filename vits/exp/vits-epoch-1000.onnx \
+     --tokens data/tokens.txt
+
+
+Download pretrained models
+--------------------------
+
+If you don't want to train from scratch, you can download the pretrained models
+by visiting the following link:
+
+  - ``_
diff --git a/_sources/recipes/index.rst.txt b/_sources/recipes/index.rst.txt
index 7265e1cf6..8df61f0d0 100644
--- a/_sources/recipes/index.rst.txt
+++ b/_sources/recipes/index.rst.txt
@@ -2,7 +2,7 @@ Recipes
 =======
 
 This page contains various recipes in ``icefall``.
-Currently, only speech recognition recipes are provided.
+Currently, we provide recipes for speech recognition, language modeling, and speech synthesis.
 
 We may add recipes for other tasks as well in the future.
 
@@ -16,3 +16,4 @@ We may add recipes for other tasks as well in the future.
    Non-streaming-ASR/index
    Streaming-ASR/index
    RNN-LM/index
+   TTS/index
diff --git a/contributing/code-style.html b/contributing/code-style.html
index debb240d4..74d418a75 100644
--- a/contributing/code-style.html
+++ b/contributing/code-style.html
@@ -1,12 +1,14 @@
 Follow the code style — icefall 0.1 documentation
diff --git a/contributing/doc.html b/contributing/doc.html
index 85ebdf023..dac487b7f 100644
--- a/contributing/doc.html
+++ b/contributing/doc.html
@@ -1,12 +1,14 @@
 Contributing to Documentation — icefall 0.1 documentation
diff --git a/contributing/how-to-create-a-recipe.html b/contributing/how-to-create-a-recipe.html
index 6443991bf..49c374276 100644
--- a/contributing/how-to-create-a-recipe.html
+++ b/contributing/how-to-create-a-recipe.html
@@ -1,12 +1,14 @@
 How to create a recipe — icefall 0.1 documentation
diff --git a/contributing/index.html b/contributing/index.html
index 1345751f7..8645f6487 100644
--- a/contributing/index.html
+++ b/contributing/index.html
@@ -1,12 +1,14 @@
 Contributing — icefall 0.1 documentation
@@ -20,7 +22,7 @@
@@ -133,7 +135,7 @@ and code to icefall