deploy: 13daf73468da70f5db49c928969fce6c6edc041a

This commit is contained in:
marcoyang1998 2024-02-21 20:16:35 +00:00
parent e5fed5060b
commit a90f45427e
38 changed files with 620 additions and 5 deletions


@ -0,0 +1,140 @@
Finetune from a supervised pre-trained Zipformer model
======================================================

This tutorial shows you how to fine-tune a supervised pre-trained **Zipformer**
transducer model on a new dataset.

.. HINT::

  We assume you have read the page :ref:`install icefall` and have set up
  the environment for ``icefall``.

.. HINT::

  We recommend you use a GPU, or several GPUs, to run this recipe.

For illustration purposes, we fine-tune the Zipformer transducer model
pre-trained on `LibriSpeech`_ on the small subset of `GigaSpeech`_. You can use your
own data for fine-tuning as long as you create a manifest for your new dataset.
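If you bring your own data, the new dataset has to be described by Lhotse manifests. The sketch below shows the general shape of a supervision manifest using only the standard library; the field names follow Lhotse's conventions but should be treated as illustrative, and for real data you would use Lhotse's own import tools instead of writing the JSONL by hand.

```python
import json

# Hypothetical utterances: (id, audio path, duration in seconds, transcript).
utterances = [
    ("utt-0001", "audio/utt-0001.wav", 3.27, "hello world"),
    ("utt-0002", "audio/utt-0002.wav", 5.10, "fine-tuning example"),
]

def make_supervisions(utts):
    """Build one Lhotse-style supervision record per utterance.

    The field names mirror Lhotse's SupervisionSegment but are illustrative;
    consult the lhotse documentation for the authoritative schema.
    """
    return [
        {
            "id": utt_id,
            "recording_id": utt_id,
            "start": 0.0,
            "duration": duration,
            "channel": 0,
            "text": text,
            "language": "English",
        }
        for utt_id, _path, duration, text in utts
    ]

supervisions = make_supervisions(utterances)
# Lhotse reads and writes manifests as JSONL: one record per line.
manifest = "\n".join(json.dumps(s) for s in supervisions)
print(manifest.splitlines()[0])
```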
Data preparation
----------------

Please follow the instructions in the `GigaSpeech recipe <https://github.com/k2-fsa/icefall/tree/master/egs/gigaspeech/ASR>`_
to prepare the fine-tuning data used in this tutorial. Only the small subset of GigaSpeech is required for this tutorial.

Model preparation
-----------------

We use the Zipformer model trained on the full LibriSpeech corpus (960 hours) as the initialization. The
checkpoint of the model can be downloaded via the following commands:

.. code-block:: bash

  $ GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/Zengwei/icefall-asr-librispeech-zipformer-2023-05-15
  $ cd icefall-asr-librispeech-zipformer-2023-05-15/exp
  $ git lfs pull --include "pretrained.pt"
  $ ln -s pretrained.pt epoch-99.pt
  $ cd ../data/lang_bpe_500
  $ git lfs pull --include bpe.model
  $ cd ../../..

Before fine-tuning, let's test the model's WER on the new domain. The following command performs
decoding on the GigaSpeech test sets:

.. code-block:: bash

  ./zipformer/decode_gigaspeech.py \
      --epoch 99 \
      --avg 1 \
      --exp-dir icefall-asr-librispeech-zipformer-2023-05-15/exp \
      --use-averaged-model 0 \
      --max-duration 1000 \
      --decoding-method greedy_search

You should see the following numbers:

.. code-block:: text

  For dev, WER of different settings are:
  greedy_search 20.06 best for dev

  For test, WER of different settings are:
  greedy_search 19.27 best for test
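For reference, the WER reported above is the word-level edit distance between the reference transcript and the hypothesis, normalized by the reference length. A minimal sketch of the computation (not icefall's actual scoring code):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference length,
    computed with the classic Levenshtein dynamic program over words."""
    ref, hyp = reference.split(), hypothesis.split()
    # dist[i][j]: edit distance between ref[:i] and hyp[:j].
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i
    for j in range(len(hyp) + 1):
        dist[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dist[i][j] = min(sub, dist[i - 1][j] + 1, dist[i][j - 1] + 1)
    return dist[len(ref)][len(hyp)] / len(ref)

print(word_error_rate("the cat sat", "the cat sat"))  # 0.0
print(word_error_rate("the cat sat", "the bat sat"))  # one substitution in three words
```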

Fine-tune
---------

Since LibriSpeech and GigaSpeech are both English datasets, we can initialize the whole
Zipformer model with the checkpoint downloaded in the previous step (otherwise we would need to
initialize the stateless decoder and joiner from scratch due to the mismatch in the output
vocabulary). The following command starts a fine-tuning experiment:

.. code-block:: bash

  $ use_mux=0
  $ do_finetune=1

  $ ./zipformer/finetune.py \
      --world-size 2 \
      --num-epochs 20 \
      --start-epoch 1 \
      --exp-dir zipformer/exp_giga_finetune${do_finetune}_mux${use_mux} \
      --use-fp16 1 \
      --base-lr 0.0045 \
      --bpe-model data/lang_bpe_500/bpe.model \
      --do-finetune $do_finetune \
      --use-mux $use_mux \
      --master-port 13024 \
      --finetune-ckpt icefall-asr-librispeech-zipformer-2023-05-15/exp/pretrained.pt \
      --max-duration 1000
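Conceptually, ``--do-finetune`` copies the pre-trained weights into the freshly constructed model before training starts. The toy sketch below illustrates that initialization step with plain dictionaries standing in for PyTorch state dicts; the function and parameter names are hypothetical, not icefall's code.

```python
def init_from_pretrained(model_state: dict, pretrained_state: dict) -> dict:
    """Copy every pre-trained parameter whose name and shape match the model;
    anything without a match (e.g. a resized output layer) keeps its fresh value."""
    initialized = dict(model_state)
    for name, value in pretrained_state.items():
        if name in initialized and len(initialized[name]) == len(value):
            initialized[name] = value
    return initialized

# Toy "state dicts": parameter name -> flat list of weights.
fresh = {"encoder.w": [0.0, 0.0], "joiner.w": [0.0, 0.0, 0.0]}
pretrained = {"encoder.w": [0.5, -0.2], "joiner.w": [1.0]}  # joiner shape differs

state = init_from_pretrained(fresh, pretrained)
print(state["encoder.w"])  # taken from the pre-trained checkpoint
print(state["joiner.w"])   # shape mismatch: stays freshly initialized
```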

The following arguments are related to fine-tuning:

- ``--base-lr``

  The learning rate used for fine-tuning. We suggest setting a **small** learning rate for fine-tuning,
  otherwise the model may forget the initialization very quickly. A reasonable value is around
  1/10 of the original learning rate, i.e., 0.0045.

- ``--do-finetune``

  If True, do fine-tuning by initializing the model from a pre-trained checkpoint.
  **Note that if you want to resume your fine-tuning experiment from a certain epoch, you
  need to set this to False.**

- ``--finetune-ckpt``

  The path to the pre-trained checkpoint (used for initialization).

- ``--use-mux``

  If True, mix the fine-tuning data with the original training data by using `CutSet.mux <https://lhotse.readthedocs.io/en/latest/api.html#lhotse.supervision.SupervisionSet.mux>`_.
  This helps maintain the model's performance on the original domain if the original training
  data is available. **If you don't have the original training data, please set it to False.**
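The behaviour of ``--use-mux`` can be pictured as weighted random interleaving of two data streams. The following is a toy illustration of that idea in plain Python with made-up cut names; it is not Lhotse's implementation, which operates lazily on ``CutSet`` objects.

```python
import random

def mux(streams, weights, n, seed=0):
    """Draw n items by repeatedly picking a stream with the given weights
    and taking its next item; a stream that runs out is dropped."""
    rng = random.Random(seed)
    iters = [iter(s) for s in streams]
    live = list(range(len(iters)))
    out = []
    while live and len(out) < n:
        k = rng.choices(live, weights=[weights[i] for i in live])[0]
        try:
            out.append(next(iters[k]))
        except StopIteration:
            live.remove(k)
    return out

finetune_cuts = [f"giga-{i}" for i in range(5)]
original_cuts = [f"libri-{i}" for i in range(5)]
# Favor the original data 3:1 so the model keeps its source-domain behaviour.
batch = mux([finetune_cuts, original_cuts], weights=[1, 3], n=6)
print(batch)
```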

After fine-tuning, let's test the WERs. You can do this via the following command:

.. code-block:: bash

  $ use_mux=0
  $ do_finetune=1

  $ ./zipformer/decode_gigaspeech.py \
      --epoch 20 \
      --avg 10 \
      --exp-dir zipformer/exp_giga_finetune${do_finetune}_mux${use_mux} \
      --use-averaged-model 1 \
      --max-duration 1000 \
      --decoding-method greedy_search

You should see numbers similar to the ones below:

.. code-block:: text

  For dev, WER of different settings are:
  greedy_search 13.47 best for dev

  For test, WER of different settings are:
  greedy_search 13.66 best for test

Compared to the original checkpoint, the fine-tuned model achieves much lower WERs
on the GigaSpeech test sets.
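As an aside, ``--avg 10`` together with ``--use-averaged-model 1`` decodes with parameters averaged over the last 10 epochs, which usually smooths out checkpoint-to-checkpoint noise. A toy sketch of plain element-wise averaging with lists standing in for parameter tensors (icefall's actual averaged-model computation is more involved):

```python
def average_checkpoints(checkpoints):
    """Element-wise mean of parameter values across checkpoints."""
    n = len(checkpoints)
    return {
        name: [sum(ckpt[name][i] for ckpt in checkpoints) / n
               for i in range(len(checkpoints[0][name]))]
        for name in checkpoints[0]
    }

# Toy checkpoints from three "epochs": parameter name -> flat weights.
ckpts = [
    {"w": [1.0, 2.0]},
    {"w": [3.0, 2.0]},
    {"w": [2.0, 5.0]},
]
print(average_checkpoints(ckpts))  # {'w': [2.0, 3.0]}
```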


@ -0,0 +1,15 @@
Fine-tune a pre-trained model
=============================

After pre-training on publicly available datasets, the ASR model is already capable of
performing general speech recognition with relatively high accuracy. However, the accuracy
could still be low on certain domains that are quite different from the original training
set. In this case, we can fine-tune the model with a small amount of additional labelled
data to improve the performance on new domains.

.. toctree::
   :maxdepth: 2
   :caption: Table of Contents

   from_supervised/finetune_zipformer


@ -17,3 +17,4 @@ We may add recipes for other tasks as well in the future.
Streaming-ASR/index Streaming-ASR/index
RNN-LM/index RNN-LM/index
TTS/index TTS/index
Finetune/index


@ -22,7 +22,7 @@
<link rel="index" title="Index" href="../genindex.html" /> <link rel="index" title="Index" href="../genindex.html" />
<link rel="search" title="Search" href="../search.html" /> <link rel="search" title="Search" href="../search.html" />
<link rel="next" title="Contributing to Documentation" href="doc.html" /> <link rel="next" title="Contributing to Documentation" href="doc.html" />
<link rel="prev" title="VITS" href="../recipes/TTS/vctk/vits.html" /> <link rel="prev" title="Finetune from a supervised pre-trained Zipformer model" href="../recipes/Finetune/from_supervised/finetune_zipformer.html" />
</head> </head>
<body class="wy-body-for-nav"> <body class="wy-body-for-nav">
@ -135,7 +135,7 @@ and code to <code class="docutils literal notranslate"><span class="pre">icefall
</div> </div>
</div> </div>
<footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer"> <footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
<a href="../recipes/TTS/vctk/vits.html" class="btn btn-neutral float-left" title="VITS" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a> <a href="../recipes/Finetune/from_supervised/finetune_zipformer.html" class="btn btn-neutral float-left" title="Finetune from a supervised pre-trained Zipformer model" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
<a href="doc.html" class="btn btn-neutral float-right" title="Contributing to Documentation" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a> <a href="doc.html" class="btn btn-neutral float-right" title="Contributing to Documentation" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
</div> </div>


@ -163,6 +163,10 @@ speech recognition recipes using <a class="reference external" href="https://git
<li class="toctree-l3"><a class="reference internal" href="recipes/TTS/vctk/vits.html">VITS</a></li> <li class="toctree-l3"><a class="reference internal" href="recipes/TTS/vctk/vits.html">VITS</a></li>
</ul> </ul>
</li> </li>
<li class="toctree-l2"><a class="reference internal" href="recipes/Finetune/index.html">Fine-tune a pre-trained model</a><ul>
<li class="toctree-l3"><a class="reference internal" href="recipes/Finetune/from_supervised/finetune_zipformer.html">Finetune from a supervised pre-trained Zipformer model</a></li>
</ul>
</li>
</ul> </ul>
</li> </li>
</ul> </ul>

Binary file not shown.


@ -0,0 +1,270 @@
<!DOCTYPE html>
<html class="writer-html5" lang="en">
<head>
<meta charset="utf-8" /><meta name="viewport" content="width=device-width, initial-scale=1" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Finetune from a supervised pre-trained Zipformer model &mdash; icefall 0.1 documentation</title>
<link rel="stylesheet" type="text/css" href="../../../_static/pygments.css?v=fa44fd50" />
<link rel="stylesheet" type="text/css" href="../../../_static/css/theme.css?v=19f00094" />
<!--[if lt IE 9]>
<script src="../../../_static/js/html5shiv.min.js"></script>
<![endif]-->
<script src="../../../_static/jquery.js?v=5d32c60e"></script>
<script src="../../../_static/_sphinx_javascript_frameworks_compat.js?v=2cd50e6c"></script>
<script data-url_root="../../../" id="documentation_options" src="../../../_static/documentation_options.js?v=e031e9a9"></script>
<script src="../../../_static/doctools.js?v=888ff710"></script>
<script src="../../../_static/sphinx_highlight.js?v=4825356b"></script>
<script src="../../../_static/js/theme.js"></script>
<link rel="index" title="Index" href="../../../genindex.html" />
<link rel="search" title="Search" href="../../../search.html" />
<link rel="next" title="Contributing" href="../../../contributing/index.html" />
<link rel="prev" title="Fine-tune a pre-trained model" href="../index.html" />
</head>
<body class="wy-body-for-nav">
<div class="wy-grid-for-nav">
<nav data-toggle="wy-nav-shift" class="wy-nav-side">
<div class="wy-side-scroll">
<div class="wy-side-nav-search" >
<a href="../../../index.html" class="icon icon-home">
icefall
</a>
<div role="search">
<form id="rtd-search-form" class="wy-form" action="../../../search.html" method="get">
<input type="text" name="q" placeholder="Search docs" aria-label="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
<p class="caption" role="heading"><span class="caption-text">Contents:</span></p>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../../for-dummies/index.html">Icefall for dummies tutorial</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../installation/index.html">Installation</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../docker/index.html">Docker</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../faqs.html">Frequently Asked Questions (FAQs)</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../model-export/index.html">Model export</a></li>
</ul>
<ul class="current">
<li class="toctree-l1 current"><a class="reference internal" href="../../index.html">Recipes</a><ul class="current">
<li class="toctree-l2"><a class="reference internal" href="../../Non-streaming-ASR/index.html">Non Streaming ASR</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../Streaming-ASR/index.html">Streaming ASR</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../RNN-LM/index.html">RNN-LM</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../TTS/index.html">TTS</a></li>
<li class="toctree-l2 current"><a class="reference internal" href="../index.html">Fine-tune a pre-trained model</a><ul class="current">
<li class="toctree-l3 current"><a class="current reference internal" href="#">Finetune from a supervised pre-trained Zipformer model</a><ul>
<li class="toctree-l4"><a class="reference internal" href="#data-preparation">Data preparation</a></li>
<li class="toctree-l4"><a class="reference internal" href="#model-preparation">Model preparation</a></li>
<li class="toctree-l4"><a class="reference internal" href="#fine-tune">Fine-tune</a></li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../../contributing/index.html">Contributing</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../huggingface/index.html">Huggingface</a></li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../../decoding-with-langugage-models/index.html">Decoding with language models</a></li>
</ul>
</div>
</div>
</nav>
<section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"><nav class="wy-nav-top" aria-label="Mobile navigation menu" >
<i data-toggle="wy-nav-top" class="fa fa-bars"></i>
<a href="../../../index.html">icefall</a>
</nav>
<div class="wy-nav-content">
<div class="rst-content">
<div role="navigation" aria-label="Page navigation">
<ul class="wy-breadcrumbs">
<li><a href="../../../index.html" class="icon icon-home" aria-label="Home"></a></li>
<li class="breadcrumb-item"><a href="../../index.html">Recipes</a></li>
<li class="breadcrumb-item"><a href="../index.html">Fine-tune a pre-trained model</a></li>
<li class="breadcrumb-item active">Finetune from a supervised pre-trained Zipformer model</li>
<li class="wy-breadcrumbs-aside">
<a href="https://github.com/k2-fsa/icefall/blob/master/docs/source/recipes/Finetune/from_supervised/finetune_zipformer.rst" class="fa fa-github"> Edit on GitHub</a>
</li>
</ul>
<hr/>
</div>
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<section id="finetune-from-a-supervised-pre-trained-zipformer-model">
<h1>Finetune from a supervised pre-trained Zipformer model<a class="headerlink" href="#finetune-from-a-supervised-pre-trained-zipformer-model" title="Permalink to this heading"></a></h1>
<p>This tutorial shows you how to fine-tune a supervised pre-trained <strong>Zipformer</strong>
transducer model on a new dataset.</p>
<div class="admonition hint">
<p class="admonition-title">Hint</p>
<p>We assume you have read the page <a class="reference internal" href="../../../installation/index.html#install-icefall"><span class="std std-ref">Installation</span></a> and have set up
the environment for <code class="docutils literal notranslate"><span class="pre">icefall</span></code>.</p>
</div>
<div class="admonition hint">
<p class="admonition-title">Hint</p>
<p>We recommend you use a GPU, or several GPUs, to run this recipe.</p>
</div>
<p>For illustration purposes, we fine-tune the Zipformer transducer model
pre-trained on <a class="reference external" href="https://www.openslr.org/12">LibriSpeech</a> on the small subset of <a class="reference external" href="https://github.com/SpeechColab/GigaSpeech">GigaSpeech</a>. You can use your
own data for fine-tuning as long as you create a manifest for your new dataset.</p>
<section id="data-preparation">
<h2>Data preparation<a class="headerlink" href="#data-preparation" title="Permalink to this heading"></a></h2>
<p>Please follow the instructions in the <a class="reference external" href="https://github.com/k2-fsa/icefall/tree/master/egs/gigaspeech/ASR">GigaSpeech recipe</a>
to prepare the fine-tuning data used in this tutorial. Only the small subset of GigaSpeech is required for this tutorial.</p>
</section>
<section id="model-preparation">
<h2>Model preparation<a class="headerlink" href="#model-preparation" title="Permalink to this heading"></a></h2>
<p>We use the Zipformer model trained on the full LibriSpeech corpus (960 hours) as the initialization. The
checkpoint of the model can be downloaded via the following commands:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>$<span class="w"> </span><span class="nv">GIT_LFS_SKIP_SMUDGE</span><span class="o">=</span><span class="m">1</span><span class="w"> </span>git<span class="w"> </span>clone<span class="w"> </span>https://huggingface.co/Zengwei/icefall-asr-librispeech-zipformer-2023-05-15
$<span class="w"> </span><span class="nb">cd</span><span class="w"> </span>icefall-asr-librispeech-zipformer-2023-05-15/exp
$<span class="w"> </span>git<span class="w"> </span>lfs<span class="w"> </span>pull<span class="w"> </span>--include<span class="w"> </span><span class="s2">&quot;pretrained.pt&quot;</span>
$<span class="w"> </span>ln<span class="w"> </span>-s<span class="w"> </span>pretrained.pt<span class="w"> </span>epoch-99.pt
$<span class="w"> </span><span class="nb">cd</span><span class="w"> </span>../data/lang_bpe_500
$<span class="w"> </span>git<span class="w"> </span>lfs<span class="w"> </span>pull<span class="w"> </span>--include<span class="w"> </span>bpe.model
$<span class="w"> </span><span class="nb">cd</span><span class="w"> </span>../../..
</pre></div>
</div>
<p>Before fine-tuning, let’s test the model’s WER on the new domain. The following command performs
decoding on the GigaSpeech test sets:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>./zipformer/decode_gigaspeech.py<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--epoch<span class="w"> </span><span class="m">99</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--avg<span class="w"> </span><span class="m">1</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--exp-dir<span class="w"> </span>icefall-asr-librispeech-zipformer-2023-05-15/exp<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--use-averaged-model<span class="w"> </span><span class="m">0</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--max-duration<span class="w"> </span><span class="m">1000</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--decoding-method<span class="w"> </span>greedy_search
</pre></div>
</div>
<p>You should see the following numbers:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">For</span> <span class="n">dev</span><span class="p">,</span> <span class="n">WER</span> <span class="n">of</span> <span class="n">different</span> <span class="n">settings</span> <span class="n">are</span><span class="p">:</span>
<span class="n">greedy_search</span> <span class="mf">20.06</span> <span class="n">best</span> <span class="k">for</span> <span class="n">dev</span>
<span class="n">For</span> <span class="n">test</span><span class="p">,</span> <span class="n">WER</span> <span class="n">of</span> <span class="n">different</span> <span class="n">settings</span> <span class="n">are</span><span class="p">:</span>
<span class="n">greedy_search</span> <span class="mf">19.27</span> <span class="n">best</span> <span class="k">for</span> <span class="n">test</span>
</pre></div>
</div>
</section>
<section id="fine-tune">
<h2>Fine-tune<a class="headerlink" href="#fine-tune" title="Permalink to this heading"></a></h2>
<p>Since LibriSpeech and GigaSpeech are both English datasets, we can initialize the whole
Zipformer model with the checkpoint downloaded in the previous step (otherwise we would need to
initialize the stateless decoder and joiner from scratch due to the mismatch in the output
vocabulary). The following command starts a fine-tuning experiment:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>$<span class="w"> </span><span class="nv">use_mux</span><span class="o">=</span><span class="m">0</span>
$<span class="w"> </span><span class="nv">do_finetune</span><span class="o">=</span><span class="m">1</span>
$<span class="w"> </span>./zipformer/finetune.py<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--world-size<span class="w"> </span><span class="m">2</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--num-epochs<span class="w"> </span><span class="m">20</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--start-epoch<span class="w"> </span><span class="m">1</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--exp-dir<span class="w"> </span>zipformer/exp_giga_finetune<span class="si">${</span><span class="nv">do_finetune</span><span class="si">}</span>_mux<span class="si">${</span><span class="nv">use_mux</span><span class="si">}</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--use-fp16<span class="w"> </span><span class="m">1</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--base-lr<span class="w"> </span><span class="m">0</span>.0045<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--bpe-model<span class="w"> </span>data/lang_bpe_500/bpe.model<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--do-finetune<span class="w"> </span><span class="nv">$do_finetune</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--use-mux<span class="w"> </span><span class="nv">$use_mux</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--master-port<span class="w"> </span><span class="m">13024</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--finetune-ckpt<span class="w"> </span>icefall-asr-librispeech-zipformer-2023-05-15/exp/pretrained.pt<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--max-duration<span class="w"> </span><span class="m">1000</span>
</pre></div>
</div>
<p>The following arguments are related to fine-tuning:</p>
<ul class="simple">
<li><dl class="simple">
<dt><code class="docutils literal notranslate"><span class="pre">--base-lr</span></code></dt><dd><p>The learning rate used for fine-tuning. We suggest setting a <strong>small</strong> learning rate for fine-tuning,
otherwise the model may forget the initialization very quickly. A reasonable value is around
1/10 of the original learning rate, i.e., 0.0045.</p>
</dd>
</dl>
</li>
<li><dl class="simple">
<dt><code class="docutils literal notranslate"><span class="pre">--do-finetune</span></code></dt><dd><p>If True, do fine-tuning by initializing the model from a pre-trained checkpoint.
<strong>Note that if you want to resume your fine-tuning experiment from a certain epoch, you
need to set this to False.</strong></p>
</dd>
</dl>
</li>
<li><dl class="simple">
<dt><code class="docutils literal notranslate"><span class="pre">--finetune-ckpt</span></code></dt><dd><p>The path to the pre-trained checkpoint (used for initialization).</p>
</dd>
</dl>
</li>
<li><dl class="simple">
<dt><code class="docutils literal notranslate"><span class="pre">--use-mux</span></code></dt><dd><p>If True, mix the fine-tuning data with the original training data by using <a class="reference external" href="https://lhotse.readthedocs.io/en/latest/api.html#lhotse.supervision.SupervisionSet.mux">CutSet.mux</a>.
This helps maintain the model’s performance on the original domain if the original training
data is available. <strong>If you don’t have the original training data, please set it to False.</strong></p>
</dd>
</dl>
</li>
</ul>
<p>After fine-tuning, let’s test the WERs. You can do this via the following command:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>$<span class="w"> </span><span class="nv">use_mux</span><span class="o">=</span><span class="m">0</span>
$<span class="w"> </span><span class="nv">do_finetune</span><span class="o">=</span><span class="m">1</span>
$<span class="w"> </span>./zipformer/decode_gigaspeech.py<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--epoch<span class="w"> </span><span class="m">20</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--avg<span class="w"> </span><span class="m">10</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--exp-dir<span class="w"> </span>zipformer/exp_giga_finetune<span class="si">${</span><span class="nv">do_finetune</span><span class="si">}</span>_mux<span class="si">${</span><span class="nv">use_mux</span><span class="si">}</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--use-averaged-model<span class="w"> </span><span class="m">1</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--max-duration<span class="w"> </span><span class="m">1000</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--decoding-method<span class="w"> </span>greedy_search
</pre></div>
</div>
<p>You should see numbers similar to the ones below:</p>
<div class="highlight-text notranslate"><div class="highlight"><pre><span></span>For dev, WER of different settings are:
greedy_search 13.47 best for dev
For test, WER of different settings are:
greedy_search 13.66 best for test
</pre></div>
</div>
<p>Compared to the original checkpoint, the fine-tuned model achieves much lower WERs
on the GigaSpeech test sets.</p>
</section>
</section>
</div>
</div>
<footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
<a href="../index.html" class="btn btn-neutral float-left" title="Fine-tune a pre-trained model" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
<a href="../../../contributing/index.html" class="btn btn-neutral float-right" title="Contributing" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
</div>
<hr/>
<div role="contentinfo">
<p>&#169; Copyright 2021, icefall development team.</p>
</div>
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
provided by <a href="https://readthedocs.org">Read the Docs</a>.
</footer>
</div>
</div>
</section>
</div>
<script>
jQuery(function () {
SphinxRtdTheme.Navigation.enable(true);
});
</script>
</body>
</html>

recipes/Finetune/index.html

@ -0,0 +1,152 @@
<!DOCTYPE html>
<html class="writer-html5" lang="en">
<head>
<meta charset="utf-8" /><meta name="viewport" content="width=device-width, initial-scale=1" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Fine-tune a pre-trained model &mdash; icefall 0.1 documentation</title>
<link rel="stylesheet" type="text/css" href="../../_static/pygments.css?v=fa44fd50" />
<link rel="stylesheet" type="text/css" href="../../_static/css/theme.css?v=19f00094" />
<!--[if lt IE 9]>
<script src="../../_static/js/html5shiv.min.js"></script>
<![endif]-->
<script src="../../_static/jquery.js?v=5d32c60e"></script>
<script src="../../_static/_sphinx_javascript_frameworks_compat.js?v=2cd50e6c"></script>
<script data-url_root="../../" id="documentation_options" src="../../_static/documentation_options.js?v=e031e9a9"></script>
<script src="../../_static/doctools.js?v=888ff710"></script>
<script src="../../_static/sphinx_highlight.js?v=4825356b"></script>
<script src="../../_static/js/theme.js"></script>
<link rel="index" title="Index" href="../../genindex.html" />
<link rel="search" title="Search" href="../../search.html" />
<link rel="next" title="Finetune from a supervised pre-trained Zipformer model" href="from_supervised/finetune_zipformer.html" />
<link rel="prev" title="VITS" href="../TTS/vctk/vits.html" />
</head>
<body class="wy-body-for-nav">
<div class="wy-grid-for-nav">
<nav data-toggle="wy-nav-shift" class="wy-nav-side">
<div class="wy-side-scroll">
<div class="wy-side-nav-search" >
<a href="../../index.html" class="icon icon-home">
icefall
</a>
<div role="search">
<form id="rtd-search-form" class="wy-form" action="../../search.html" method="get">
<input type="text" name="q" placeholder="Search docs" aria-label="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
<p class="caption" role="heading"><span class="caption-text">Contents:</span></p>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../for-dummies/index.html">Icefall for dummies tutorial</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../installation/index.html">Installation</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../docker/index.html">Docker</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../faqs.html">Frequently Asked Questions (FAQs)</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../model-export/index.html">Model export</a></li>
</ul>
<ul class="current">
<li class="toctree-l1 current"><a class="reference internal" href="../index.html">Recipes</a><ul class="current">
<li class="toctree-l2"><a class="reference internal" href="../Non-streaming-ASR/index.html">Non Streaming ASR</a></li>
<li class="toctree-l2"><a class="reference internal" href="../Streaming-ASR/index.html">Streaming ASR</a></li>
<li class="toctree-l2"><a class="reference internal" href="../RNN-LM/index.html">RNN-LM</a></li>
<li class="toctree-l2"><a class="reference internal" href="../TTS/index.html">TTS</a></li>
<li class="toctree-l2 current"><a class="current reference internal" href="#">Fine-tune a pre-trained model</a><ul>
<li class="toctree-l3"><a class="reference internal" href="from_supervised/finetune_zipformer.html">Finetune from a supervised pre-trained Zipformer model</a></li>
</ul>
</li>
</ul>
</li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../contributing/index.html">Contributing</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../huggingface/index.html">Huggingface</a></li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../decoding-with-langugage-models/index.html">Decoding with language models</a></li>
</ul>
</div>
</div>
</nav>
<section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"><nav class="wy-nav-top" aria-label="Mobile navigation menu" >
<i data-toggle="wy-nav-top" class="fa fa-bars"></i>
<a href="../../index.html">icefall</a>
</nav>
<div class="wy-nav-content">
<div class="rst-content">
<div role="navigation" aria-label="Page navigation">
<ul class="wy-breadcrumbs">
<li><a href="../../index.html" class="icon icon-home" aria-label="Home"></a></li>
<li class="breadcrumb-item"><a href="../index.html">Recipes</a></li>
<li class="breadcrumb-item active">Fine-tune a pre-trained model</li>
<li class="wy-breadcrumbs-aside">
<a href="https://github.com/k2-fsa/icefall/blob/master/docs/source/recipes/Finetune/index.rst" class="fa fa-github"> Edit on GitHub</a>
</li>
</ul>
<hr/>
</div>
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<section id="fine-tune-a-pre-trained-model">
<h1>Fine-tune a pre-trained model<a class="headerlink" href="#fine-tune-a-pre-trained-model" title="Permalink to this heading"></a></h1>
<p>After pre-training on publicly available datasets, the ASR model is already capable of
performing general speech recognition with relatively high accuracy. However, the accuracy
could still be low on certain domains that are quite different from the original training
set. In this case, we can fine-tune the model with a small amount of additional labelled
data to improve the performance on new domains.</p>
<div class="toctree-wrapper compound">
<p class="caption" role="heading"><span class="caption-text">Table of Contents</span></p>
<ul>
<li class="toctree-l1"><a class="reference internal" href="from_supervised/finetune_zipformer.html">Finetune from a supervised pre-trained Zipformer model</a><ul>
<li class="toctree-l2"><a class="reference internal" href="from_supervised/finetune_zipformer.html#data-preparation">Data preparation</a></li>
<li class="toctree-l2"><a class="reference internal" href="from_supervised/finetune_zipformer.html#model-preparation">Model preparation</a></li>
<li class="toctree-l2"><a class="reference internal" href="from_supervised/finetune_zipformer.html#fine-tune">Fine-tune</a></li>
</ul>
</li>
</ul>
</div>
</section>
</div>
</div>
<footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
<a href="../TTS/vctk/vits.html" class="btn btn-neutral float-left" title="VITS" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
<a href="from_supervised/finetune_zipformer.html" class="btn btn-neutral float-right" title="Finetune from a supervised pre-trained Zipformer model" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
</div>
<hr/>
<div role="contentinfo">
<p>&#169; Copyright 2021, icefall development team.</p>
</div>
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
provided by <a href="https://readthedocs.org">Read the Docs</a>.
</footer>
</div>
</div>
</section>
</div>
<script>
jQuery(function () {
SphinxRtdTheme.Navigation.enable(true);
});
</script>
</body>
</html>

View File

@ -69,6 +69,7 @@
 <li class="toctree-l2"><a class="reference internal" href="../../Streaming-ASR/index.html">Streaming ASR</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../../RNN-LM/index.html">RNN-LM</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../../TTS/index.html">TTS</a></li>
+<li class="toctree-l2"><a class="reference internal" href="../../Finetune/index.html">Fine-tune a pre-trained model</a></li>
 </ul>
 </li>
 </ul>

View File

@ -64,6 +64,7 @@
 <li class="toctree-l2"><a class="reference internal" href="../Streaming-ASR/index.html">Streaming ASR</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../RNN-LM/index.html">RNN-LM</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../TTS/index.html">TTS</a></li>
+<li class="toctree-l2"><a class="reference internal" href="../Finetune/index.html">Fine-tune a pre-trained model</a></li>
 </ul>
 </li>
 </ul>

View File

@ -72,6 +72,7 @@
 <li class="toctree-l2"><a class="reference internal" href="../../Streaming-ASR/index.html">Streaming ASR</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../../RNN-LM/index.html">RNN-LM</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../../TTS/index.html">TTS</a></li>
+<li class="toctree-l2"><a class="reference internal" href="../../Finetune/index.html">Fine-tune a pre-trained model</a></li>
 </ul>
 </li>
 </ul>

View File

@ -68,6 +68,7 @@
 <li class="toctree-l2"><a class="reference internal" href="../../Streaming-ASR/index.html">Streaming ASR</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../../RNN-LM/index.html">RNN-LM</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../../TTS/index.html">TTS</a></li>
+<li class="toctree-l2"><a class="reference internal" href="../../Finetune/index.html">Fine-tune a pre-trained model</a></li>
 </ul>
 </li>
 </ul>

View File

@ -67,6 +67,7 @@
 <li class="toctree-l2"><a class="reference internal" href="../../Streaming-ASR/index.html">Streaming ASR</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../../RNN-LM/index.html">RNN-LM</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../../TTS/index.html">TTS</a></li>
+<li class="toctree-l2"><a class="reference internal" href="../../Finetune/index.html">Fine-tune a pre-trained model</a></li>
 </ul>
 </li>
 </ul>

View File

@ -61,6 +61,7 @@
 </ul>
 </li>
 <li class="toctree-l2"><a class="reference internal" href="../TTS/index.html">TTS</a></li>
+<li class="toctree-l2"><a class="reference internal" href="../Finetune/index.html">Fine-tune a pre-trained model</a></li>
 </ul>
 </li>
 </ul>

View File

@ -61,6 +61,7 @@
 </ul>
 </li>
 <li class="toctree-l2"><a class="reference internal" href="../../TTS/index.html">TTS</a></li>
+<li class="toctree-l2"><a class="reference internal" href="../../Finetune/index.html">Fine-tune a pre-trained model</a></li>
 </ul>
 </li>
 </ul>

View File

@ -62,6 +62,7 @@
 </li>
 <li class="toctree-l2"><a class="reference internal" href="../RNN-LM/index.html">RNN-LM</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../TTS/index.html">TTS</a></li>
+<li class="toctree-l2"><a class="reference internal" href="../Finetune/index.html">Fine-tune a pre-trained model</a></li>
 </ul>
 </li>
 </ul>

View File

@ -66,6 +66,7 @@
 </li>
 <li class="toctree-l2"><a class="reference internal" href="../RNN-LM/index.html">RNN-LM</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../TTS/index.html">TTS</a></li>
+<li class="toctree-l2"><a class="reference internal" href="../Finetune/index.html">Fine-tune a pre-trained model</a></li>
 </ul>
 </li>
 </ul>

View File

@ -67,6 +67,7 @@
 </li>
 <li class="toctree-l2"><a class="reference internal" href="../../RNN-LM/index.html">RNN-LM</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../../TTS/index.html">TTS</a></li>
+<li class="toctree-l2"><a class="reference internal" href="../../Finetune/index.html">Fine-tune a pre-trained model</a></li>
 </ul>
 </li>
 </ul>

View File

@ -62,6 +62,7 @@
 <li class="toctree-l3"><a class="reference internal" href="vctk/vits.html">VITS</a></li>
 </ul>
 </li>
+<li class="toctree-l2"><a class="reference internal" href="../Finetune/index.html">Fine-tune a pre-trained model</a></li>
 </ul>
 </li>
 </ul>

View File

@ -70,6 +70,7 @@
 <li class="toctree-l3"><a class="reference internal" href="../vctk/vits.html">VITS</a></li>
 </ul>
 </li>
+<li class="toctree-l2"><a class="reference internal" href="../../Finetune/index.html">Fine-tune a pre-trained model</a></li>
 </ul>
 </li>
 </ul>

View File

@ -21,7 +21,7 @@
 <script src="../../../_static/js/theme.js"></script>
 <link rel="index" title="Index" href="../../../genindex.html" />
 <link rel="search" title="Search" href="../../../search.html" />
-<link rel="next" title="Contributing" href="../../../contributing/index.html" />
+<link rel="next" title="Fine-tune a pre-trained model" href="../../Finetune/index.html" />
 <link rel="prev" title="VITS" href="../ljspeech/vits.html" />
 </head>
@ -70,6 +70,7 @@
 </li>
 </ul>
 </li>
+<li class="toctree-l2"><a class="reference internal" href="../../Finetune/index.html">Fine-tune a pre-trained model</a></li>
 </ul>
 </li>
 </ul>
@ -219,7 +220,7 @@ by visiting the following link:</p>
 </div>
 <footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
 <a href="../ljspeech/vits.html" class="btn btn-neutral float-left" title="VITS" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
-<a href="../../../contributing/index.html" class="btn btn-neutral float-right" title="Contributing" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
+<a href="../../Finetune/index.html" class="btn btn-neutral float-right" title="Fine-tune a pre-trained model" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
 </div>
 <hr/>

View File

@ -58,6 +58,7 @@
 <li class="toctree-l2"><a class="reference internal" href="Streaming-ASR/index.html">Streaming ASR</a></li>
 <li class="toctree-l2"><a class="reference internal" href="RNN-LM/index.html">RNN-LM</a></li>
 <li class="toctree-l2"><a class="reference internal" href="TTS/index.html">TTS</a></li>
+<li class="toctree-l2"><a class="reference internal" href="Finetune/index.html">Fine-tune a pre-trained model</a></li>
 </ul>
 </li>
 </ul>
@ -122,6 +123,10 @@ Currently, we provide recipes for speech recognition, language model, and speech
 <li class="toctree-l2"><a class="reference internal" href="TTS/vctk/vits.html">VITS</a></li>
 </ul>
 </li>
+<li class="toctree-l1"><a class="reference internal" href="Finetune/index.html">Fine-tune a pre-trained model</a><ul>
+<li class="toctree-l2"><a class="reference internal" href="Finetune/from_supervised/finetune_zipformer.html">Finetune from a supervised pre-trained Zipformer model</a></li>
+</ul>
+</li>
 </ul>
 </div>
 </section>

File diff suppressed because one or more lines are too long