VITS
This tutorial shows you how to train a VITS model
with the LJSpeech dataset.


Data preparation
$ cd egs/ljspeech/TTS
$ ./prepare.sh

To run stage 1 to stage 5, use
$ ./prepare.sh --stage 1 --stop_stage 5


Build Monotonic Alignment Search
$ cd vits/monotonic_align
$ python setup.py build_ext --inplace
$ cd ../../
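
Conceptually, the extension you just built implements monotonic alignment search
(MAS): a dynamic program that, given a matrix of log-likelihoods value[y, x] for
mel frame y and text token x, finds the monotonic alignment with the highest total
score. Below is a minimal pure-Python sketch of the same idea (function and
variable names are illustrative, and it assumes at least as many frames as tokens;
the compiled Cython version is what training actually uses):

import numpy as np

def maximum_path(value: np.ndarray) -> np.ndarray:
    """value: (t_y, t_x) log-likelihoods with t_y >= t_x; returns a 0/1 path mask."""
    t_y, t_x = value.shape
    value = value.copy()
    max_neg = -1e9
    # Forward pass: value[y, x] becomes the best cumulative score over all
    # monotonic paths from (0, 0) to (y, x).
    for y in range(t_y):
        for x in range(max(0, t_x + y - t_y), min(t_x, y + 1)):
            v_stay = max_neg if x == y else value[y - 1, x]  # keep token x
            if x == 0:
                v_step = 0.0 if y == 0 else max_neg
            else:
                v_step = value[y - 1, x - 1]                 # advance to token x
            value[y, x] += max(v_step, v_stay)
    # Backtracking: walk from the last frame back to (0, 0).
    path = np.zeros((t_y, t_x), dtype=np.int32)
    index = t_x - 1
    for y in range(t_y - 1, -1, -1):
        path[y, index] = 1
        if index != 0 and (index == y or value[y - 1, index] < value[y - 1, index - 1]):
            index -= 1
    return path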


Training
$ export CUDA_VISIBLE_DEVICES="0,1,2,3"
$ ./vits/train.py \
  --world-size 4 \
  --num-epochs 1000 \
  --start-epoch 1 \
  --use-fp16 1 \
  --exp-dir vits/exp \
  --tokens data/tokens.txt \
  --max-duration 500


Note

You can adjust the hyper-parameters to control the size of the VITS model and
the training configurations. For more details, please run ./vits/train.py --help.

Note

The training can take a long time (usually a couple of days).

Training logs, checkpoints and tensorboard logs are saved in vits/exp.
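
You can follow the training progress with tensorboard (the exact log
subdirectory is an assumption here; adjust it to whatever appears under vits/exp):
$ tensorboard --logdir vits/exp/tensorboard --port 6006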


Inference
The inference part uses checkpoints saved by the training part, so you have to run the
training part first. It will save the ground-truth and generated wavs to the directory
vits/exp/infer/epoch-*/wav, e.g., vits/exp/infer/epoch-1000/wav.
$ export CUDA_VISIBLE_DEVICES="0"
$ ./vits/infer.py \
  --epoch 1000 \
  --exp-dir vits/exp \
  --tokens data/tokens.txt \
  --max-duration 500
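
After inference finishes, you can sanity-check a generated wav, for example with
the soundfile package (the glob pattern is illustrative; an LJSpeech VITS model
should produce 22050 Hz audio):

import glob
import soundfile as sf

# Pick any generated wav; the exact filenames in the directory are not fixed here.
path = sorted(glob.glob("vits/exp/infer/epoch-1000/wav/*.wav"))[0]
wav, sr = sf.read(path)
print(path, sr, wav.shape)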


Note

For more details, please run ./vits/infer.py --help.


Export models
Currently we only support ONNX model exporting. It will generate two files in the
given exp-dir: vits-epoch-*.onnx and vits-epoch-*.int8.onnx.
$ ./vits/export-onnx.py \
  --epoch 1000 \
  --exp-dir vits/exp \
  --tokens data/tokens.txt

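The int8 file is a weight-quantized variant of the float32 model. For reference,
this is roughly how such a file can be produced with onnxruntime's dynamic
quantization (whether export-onnx.py uses exactly these options is an assumption):

from onnxruntime.quantization import QuantType, quantize_dynamic

# Quantize the float32 weights to int8; activations remain float and are
# quantized dynamically at run time.
quantize_dynamic(
    model_input="vits/exp/vits-epoch-1000.onnx",
    model_output="vits/exp/vits-epoch-1000.int8.onnx",
    weight_type=QuantType.QInt8,
)
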
You can test the exported ONNX model with:
$ ./vits/test_onnx.py \
  --model-filename vits/exp/vits-epoch-1000.onnx \
  --tokens data/tokens.txt
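
If you want to drive the exported model from your own code, a reasonable first step
is to inspect its input and output tensors with onnxruntime (their names and shapes
are defined by export-onnx.py, so check them rather than assuming):

import onnxruntime as ort

sess = ort.InferenceSession(
    "vits/exp/vits-epoch-1000.onnx",
    providers=["CPUExecutionProvider"],
)
# Print the model's I/O signature before wiring up real inputs.
for inp in sess.get_inputs():
    print("input:", inp.name, inp.shape, inp.type)
for out in sess.get_outputs():
    print("output:", out.name, out.shape, out.type)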


Download pretrained models
If you don’t want to train from scratch, you can download the pretrained models
by visiting the following link: