diff --git a/docs/source/recipes/TTS/ljspeech/vits.rst b/docs/source/recipes/TTS/ljspeech/vits.rst
index 323d0adfc..d31bf6302 100644
--- a/docs/source/recipes/TTS/ljspeech/vits.rst
+++ b/docs/source/recipes/TTS/ljspeech/vits.rst
@@ -56,7 +56,8 @@ Training
       --start-epoch 1 \
       --use-fp16 1 \
       --exp-dir vits/exp \
-      --tokens data/tokens.txt
+      --tokens data/tokens.txt \
+      --model-type high \
       --max-duration 500
 
 .. note::
@@ -64,6 +65,11 @@ Training
     You can adjust the hyper-parameters to control the size of the VITS model and
     the training configurations. For more details, please run ``./vits/train.py --help``.
 
+.. warning::
+
+   If you want a model that runs faster on CPU, please use ``--model-type low``
+   or ``--model-type medium``.
+
 .. note::
 
     The training can take a long time (usually a couple of days).
@@ -95,8 +101,8 @@ training part first. It will save the ground-truth and generated wavs to the dir
 Export models
 -------------
 
-Currently we only support ONNX model exporting. It will generate two files in the given ``exp-dir``:
-``vits-epoch-*.onnx`` and ``vits-epoch-*.int8.onnx``.
+Currently, only exporting to ONNX is supported. The export generates one file in the given ``exp-dir``:
+``vits-epoch-*.onnx``.
 
 .. code-block:: bash
 
@@ -120,4 +126,7 @@ Download pretrained models
 If you don't want to train from scratch, you can download the pretrained models
 by visiting the following link:
 
-  - `<https://huggingface.co/Zengwei/icefall-tts-ljspeech-vits-2024-02-28>`_
+  - ``--model-type=high``: `<https://huggingface.co/Zengwei/icefall-tts-ljspeech-vits-2024-02-28>`_
+  - ``--model-type=medium``: `<https://huggingface.co/csukuangfj/icefall-tts-ljspeech-vits-medium-2024-03-12>`_
+  - ``--model-type=low``: `<https://huggingface.co/csukuangfj/icefall-tts-ljspeech-vits-low-2024-03-12>`_
+
diff --git a/egs/ljspeech/TTS/README.md b/egs/ljspeech/TTS/README.md
index 8b28193ca..7b112c12c 100644
--- a/egs/ljspeech/TTS/README.md
+++ b/egs/ljspeech/TTS/README.md
@@ -43,7 +43,7 @@ If you feel that the trained model is slow at runtime, you can specify the
 argument `--model-type` during training. Possible values are:
 
   - `low`, means **low** quality. The resulting model is very small in file size
-    and runs very fast. The following is a wave file generatd by a `low` model
+    and runs very fast. The following is a wave file generated by a `low` quality model:
 
     https://github.com/k2-fsa/icefall/assets/5284924/d5758c24-470d-40ee-b089-e57fcba81633
 
@@ -52,15 +52,24 @@ argument `--model-type` during training. Possible values are:
     The exported onnx model has a file size of ``26.8 MB`` (float32).
 
   - `medium`, means **medium** quality.
-    The following is a wave file generatd by a `medium` model
+    The following is a wave file generated by a `medium` quality model:
 
     https://github.com/k2-fsa/icefall/assets/5284924/b199d960-3665-4d0d-9ae9-a1bb69cbc8ac
 
     The text is `Ask not what your country can do for you; ask what you can do for your country.`
 
-    The exported onnx model has file size of ``70.9 MB`` (float32).
+    The exported onnx model has a file size of ``70.9 MB`` (float32).
+
+  - `high`, means **high** quality. This is the default value.
+
+    The following is a wave file generated by a `high` quality model:
+
+    https://github.com/k2-fsa/icefall/assets/5284924/b39f3048-73a6-4267-bf95-df5abfdb28fc
+
+    The text is `Ask not what your country can do for you; ask what you can do for your country.`
+
+    The exported onnx model has a file size of ``113 MB`` (float32).
 
-  - `high`, means **high** quality
 
 A pre-trained `low` model trained using 4xV100 32GB GPU with the following command can be found at
 <https://huggingface.co/csukuangfj/icefall-tts-ljspeech-vits-low-2024-03-12>