Mirror of https://github.com/k2-fsa/icefall.git (synced 2025-08-09 10:02:22 +00:00)
commit c1b715f7df (parent: 05b3381bce)
deploy: 15bd9a841e347a8881fc6df599fd440ebb118da4
@@ -13,6 +13,14 @@ with the `LJSpeech <https://keithito.com/LJ-Speech-Dataset/>`_ dataset.
 
    The VITS paper: `Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech <https://arxiv.org/pdf/2106.06103.pdf>`_
 
+Install extra dependencies
+--------------------------
+
+.. code-block:: bash
+
+  pip install piper_phonemize -f https://k2-fsa.github.io/icefall/piper_phonemize.html
+  pip install numba espnet_tts_frontend
+
 Data preparation
 ----------------
 
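The extra dependencies added above are ordinary Python packages, so a quick import check confirms the wheels installed correctly. The snippet below is only a sketch, not part of the recipe: it assumes `piper_phonemize` exposes `phonemize_espeak(text, voice)` (the G2P helper the LJSpeech recipe relies on) and that an `en-us` espeak-ng voice is available; adjust it if your installed version differs.

```python
# Sanity check for the extra TTS dependencies (sketch, not part of the recipe).
# Assumes piper_phonemize provides phonemize_espeak(text, voice).
import numba
from piper_phonemize import phonemize_espeak

print("numba:", numba.__version__)

# Phonemize one sentence with the US English espeak-ng voice.
phonemes = phonemize_espeak("Ask not what your country can do for you.", "en-us")
print(phonemes)  # nested list of phoneme symbols, one sub-list per sentence
```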
@@ -130,3 +138,64 @@ by visiting the following link:
 - ``--model-type=medium``: `<https://huggingface.co/csukuangfj/icefall-tts-ljspeech-vits-medium-2024-03-12>`_
 - ``--model-type=low``: `<https://huggingface.co/csukuangfj/icefall-tts-ljspeech-vits-low-2024-03-12>`_
 
+Usage in sherpa-onnx
+--------------------
+
+The following describes how to test the exported ONNX model in `sherpa-onnx`_.
+
+.. hint::
+
+   `sherpa-onnx`_ supports different programming languages, e.g., C++, C, Python,
+   Kotlin, Java, Swift, Go, C#, etc. It also supports Android and iOS.
+
+   We only describe how to use pre-built binaries from `sherpa-onnx`_ below.
+   Please refer to `<https://k2-fsa.github.io/sherpa/onnx/>`_
+   for more documentation.
+
+Install sherpa-onnx
+^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: bash
+
+  pip install sherpa-onnx
+
+To check that you have installed `sherpa-onnx`_ successfully, please run:
+
+.. code-block:: bash
+
+  which sherpa-onnx-offline-tts
+  sherpa-onnx-offline-tts --help
+
+Download lexicon files
+^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: bash
+
+  cd /tmp
+  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/espeak-ng-data.tar.bz2
+  tar xf espeak-ng-data.tar.bz2
+
+Run sherpa-onnx
+^^^^^^^^^^^^^^^
+
+.. code-block:: bash
+
+  cd egs/ljspeech/TTS
+
+  sherpa-onnx-offline-tts \
+    --vits-model=vits/exp/vits-epoch-1000.onnx \
+    --vits-tokens=data/tokens.txt \
+    --vits-data-dir=/tmp/espeak-ng-data \
+    --num-threads=1 \
+    --output-filename=./high.wav \
+    "Ask not what your country can do for you; ask what you can do for your country."
+
+.. hint::
+
+   You can also use ``sherpa-onnx-offline-tts-play`` to play the audio
+   as it is generating.
+
+You should get a file ``high.wav`` after running the above command.
+
+Congratulations! You have successfully trained and exported a text-to-speech
+model and run it with `sherpa-onnx`_.
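The section added above drives the exported model with the `sherpa-onnx-offline-tts` command-line tool. The same checkpoint can also be loaded from the sherpa-onnx Python API, which is convenient when embedding the model in a script. The following is a minimal sketch under the assumption that the installed sherpa-onnx wheel provides the `OfflineTts`/`OfflineTtsConfig` classes of recent releases and that `soundfile` is available for writing the WAV file; the paths mirror the CLI example.

```python
# Sketch: run the exported VITS model through the sherpa-onnx Python API
# instead of the CLI. Class names follow recent sherpa-onnx releases and are
# an assumption here; soundfile is used only to write the generated audio.
import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits/exp/vits-epoch-1000.onnx",  # exported ONNX model
            tokens="data/tokens.txt",               # token table from the recipe
            data_dir="/tmp/espeak-ng-data",         # lexicon files downloaded above
        ),
        num_threads=1,
    ),
)

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(
    "Ask not what your country can do for you; "
    "ask what you can do for your country."
)

# audio.samples holds float32 samples in [-1, 1]; audio.sample_rate is in Hz.
sf.write("high.wav", audio.samples, samplerate=audio.sample_rate)
print(f"Saved high.wav at {audio.sample_rate} Hz")
```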
@@ -104,12 +104,14 @@
 <div class="toctree-wrapper compound">
 <ul>
 <li class="toctree-l1"><a class="reference internal" href="ljspeech/vits.html">VITS-LJSpeech</a><ul>
+<li class="toctree-l2"><a class="reference internal" href="ljspeech/vits.html#install-extra-dependencies">Install extra dependencies</a></li>
 <li class="toctree-l2"><a class="reference internal" href="ljspeech/vits.html#data-preparation">Data preparation</a></li>
 <li class="toctree-l2"><a class="reference internal" href="ljspeech/vits.html#build-monotonic-alignment-search">Build Monotonic Alignment Search</a></li>
 <li class="toctree-l2"><a class="reference internal" href="ljspeech/vits.html#training">Training</a></li>
 <li class="toctree-l2"><a class="reference internal" href="ljspeech/vits.html#inference">Inference</a></li>
 <li class="toctree-l2"><a class="reference internal" href="ljspeech/vits.html#export-models">Export models</a></li>
 <li class="toctree-l2"><a class="reference internal" href="ljspeech/vits.html#download-pretrained-models">Download pretrained models</a></li>
+<li class="toctree-l2"><a class="reference internal" href="ljspeech/vits.html#usage-in-sherpa-onnx">Usage in sherpa-onnx</a></li>
 </ul>
 </li>
 <li class="toctree-l1"><a class="reference internal" href="vctk/vits.html">VITS-VCTK</a><ul>
@@ -59,12 +59,14 @@
 <li class="toctree-l2"><a class="reference internal" href="../../RNN-LM/index.html">RNN-LM</a></li>
 <li class="toctree-l2 current"><a class="reference internal" href="../index.html">TTS</a><ul class="current">
 <li class="toctree-l3 current"><a class="current reference internal" href="#">VITS-LJSpeech</a><ul>
+<li class="toctree-l4"><a class="reference internal" href="#install-extra-dependencies">Install extra dependencies</a></li>
 <li class="toctree-l4"><a class="reference internal" href="#data-preparation">Data preparation</a></li>
 <li class="toctree-l4"><a class="reference internal" href="#build-monotonic-alignment-search">Build Monotonic Alignment Search</a></li>
 <li class="toctree-l4"><a class="reference internal" href="#training">Training</a></li>
 <li class="toctree-l4"><a class="reference internal" href="#inference">Inference</a></li>
 <li class="toctree-l4"><a class="reference internal" href="#export-models">Export models</a></li>
 <li class="toctree-l4"><a class="reference internal" href="#download-pretrained-models">Download pretrained models</a></li>
+<li class="toctree-l4"><a class="reference internal" href="#usage-in-sherpa-onnx">Usage in sherpa-onnx</a></li>
 </ul>
 </li>
 <li class="toctree-l3"><a class="reference internal" href="../vctk/vits.html">VITS-VCTK</a></li>
@@ -120,6 +122,13 @@ with the <a class="reference external" href="https://keithito.com/LJ-Speech-Data
 <p class="admonition-title">Note</p>
 <p>The VITS paper: <a class="reference external" href="https://arxiv.org/pdf/2106.06103.pdf">Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech</a></p>
 </div>
+<section id="install-extra-dependencies">
+<h2>Install extra dependencies<a class="headerlink" href="#install-extra-dependencies" title="Permalink to this heading"></a></h2>
+<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>pip<span class="w"> </span>install<span class="w"> </span>piper_phonemize<span class="w"> </span>-f<span class="w"> </span>https://k2-fsa.github.io/icefall/piper_phonemize.html
+pip<span class="w"> </span>install<span class="w"> </span>numba<span class="w"> </span>espnet_tts_frontend
+</pre></div>
+</div>
+</section>
 <section id="data-preparation">
 <h2>Data preparation<a class="headerlink" href="#data-preparation" title="Permalink to this heading"></a></h2>
 <div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>$<span class="w"> </span><span class="nb">cd</span><span class="w"> </span>egs/ljspeech/TTS
@@ -220,6 +229,59 @@ by visiting the following link:</p>
 </ul>
 </div></blockquote>
 </section>
+<section id="usage-in-sherpa-onnx">
+<h2>Usage in sherpa-onnx<a class="headerlink" href="#usage-in-sherpa-onnx" title="Permalink to this heading"></a></h2>
+<p>The following describes how to test the exported ONNX model in <a class="reference external" href="https://github.com/k2-fsa/sherpa-onnx">sherpa-onnx</a>.</p>
+<div class="admonition hint">
+<p class="admonition-title">Hint</p>
+<p><a class="reference external" href="https://github.com/k2-fsa/sherpa-onnx">sherpa-onnx</a> supports different programming languages, e.g., C++, C, Python,
+Kotlin, Java, Swift, Go, C#, etc. It also supports Android and iOS.</p>
+<p>We only describe how to use pre-built binaries from <a class="reference external" href="https://github.com/k2-fsa/sherpa-onnx">sherpa-onnx</a> below.
+Please refer to <a class="reference external" href="https://k2-fsa.github.io/sherpa/onnx/">https://k2-fsa.github.io/sherpa/onnx/</a>
+for more documentation.</p>
+</div>
+<section id="install-sherpa-onnx">
+<h3>Install sherpa-onnx<a class="headerlink" href="#install-sherpa-onnx" title="Permalink to this heading"></a></h3>
+<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>pip<span class="w"> </span>install<span class="w"> </span>sherpa-onnx
+</pre></div>
+</div>
+<p>To check that you have installed <a class="reference external" href="https://github.com/k2-fsa/sherpa-onnx">sherpa-onnx</a> successfully, please run:</p>
+<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>which<span class="w"> </span>sherpa-onnx-offline-tts
+sherpa-onnx-offline-tts<span class="w"> </span>--help
+</pre></div>
+</div>
+</section>
+<section id="download-lexicon-files">
+<h3>Download lexicon files<a class="headerlink" href="#download-lexicon-files" title="Permalink to this heading"></a></h3>
+<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span><span class="nb">cd</span><span class="w"> </span>/tmp
+wget<span class="w"> </span>https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/espeak-ng-data.tar.bz2
+tar<span class="w"> </span>xf<span class="w"> </span>espeak-ng-data.tar.bz2
+</pre></div>
+</div>
+</section>
+<section id="run-sherpa-onnx">
+<h3>Run sherpa-onnx<a class="headerlink" href="#run-sherpa-onnx" title="Permalink to this heading"></a></h3>
+<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span><span class="nb">cd</span><span class="w"> </span>egs/ljspeech/TTS
+
+sherpa-onnx-offline-tts<span class="w"> </span><span class="se">\</span>
+<span class="w">  </span>--vits-model<span class="o">=</span>vits/exp/vits-epoch-1000.onnx<span class="w"> </span><span class="se">\</span>
+<span class="w">  </span>--vits-tokens<span class="o">=</span>data/tokens.txt<span class="w"> </span><span class="se">\</span>
+<span class="w">  </span>--vits-data-dir<span class="o">=</span>/tmp/espeak-ng-data<span class="w"> </span><span class="se">\</span>
+<span class="w">  </span>--num-threads<span class="o">=</span><span class="m">1</span><span class="w"> </span><span class="se">\</span>
+<span class="w">  </span>--output-filename<span class="o">=</span>./high.wav<span class="w"> </span><span class="se">\</span>
+<span class="w">  </span><span class="s2">"Ask not what your country can do for you; ask what you can do for your country."</span>
+</pre></div>
+</div>
+<div class="admonition hint">
+<p class="admonition-title">Hint</p>
+<p>You can also use <code class="docutils literal notranslate"><span class="pre">sherpa-onnx-offline-tts-play</span></code> to play the audio
+as it is generating.</p>
+</div>
+<p>You should get a file <code class="docutils literal notranslate"><span class="pre">high.wav</span></code> after running the above command.</p>
+<p>Congratulations! You have successfully trained and exported a text-to-speech
+model and run it with <a class="reference external" href="https://github.com/k2-fsa/sherpa-onnx">sherpa-onnx</a>.</p>
+</section>
+</section>
 </section>
 
 
File diff suppressed because one or more lines are too long