<!DOCTYPE html>
<html class="writer-html5" lang="en">
<head>
<meta charset="utf-8" /><meta name="viewport" content="width=device-width, initial-scale=1" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Export streaming Zipformer transducer models to ncnn — icefall 0.1 documentation</title>
<link rel="stylesheet" type="text/css" href="../_static/pygments.css?v=fa44fd50" />
<link rel="stylesheet" type="text/css" href="../_static/css/theme.css?v=19f00094" />
<!--[if lt IE 9]>
<script src="../_static/js/html5shiv.min.js"></script>
<![endif]-->
<script src="../_static/jquery.js?v=5d32c60e"></script>
<script src="../_static/_sphinx_javascript_frameworks_compat.js?v=2cd50e6c"></script>
<script data-url_root="../" id="documentation_options" src="../_static/documentation_options.js?v=e031e9a9"></script>
<script src="../_static/doctools.js?v=888ff710"></script>
<script src="../_static/sphinx_highlight.js?v=4825356b"></script>
<script src="../_static/js/theme.js"></script>
<link rel="index" title="Index" href="../genindex.html" />
<link rel="search" title="Search" href="../search.html" />
<link rel="next" title="Export ConvEmformer transducer models to ncnn" href="export-ncnn-conv-emformer.html" />
<link rel="prev" title="Export to ncnn" href="export-ncnn.html" />
</head>
<body class="wy-body-for-nav">
<div class="wy-grid-for-nav">
<nav data-toggle="wy-nav-shift" class="wy-nav-side">
<div class="wy-side-scroll">
<div class="wy-side-nav-search" >
<a href="../index.html" class="icon icon-home">
icefall
</a>
<div role="search">
<form id="rtd-search-form" class="wy-form" action="../search.html" method="get">
<input type="text" name="q" placeholder="Search docs" aria-label="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
<p class="caption" role="heading"><span class="caption-text">Contents:</span></p>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="../for-dummies/index.html">Icefall for dummies tutorial</a></li>
<li class="toctree-l1"><a class="reference internal" href="../installation/index.html">Installation</a></li>
<li class="toctree-l1"><a class="reference internal" href="../docker/index.html">Docker</a></li>
<li class="toctree-l1"><a class="reference internal" href="../faqs.html">Frequently Asked Questions (FAQs)</a></li>
<li class="toctree-l1 current"><a class="reference internal" href="index.html">Model export</a><ul class="current">
<li class="toctree-l2"><a class="reference internal" href="export-model-state-dict.html">Export model.state_dict()</a></li>
<li class="toctree-l2"><a class="reference internal" href="export-with-torch-jit-trace.html">Export model with torch.jit.trace()</a></li>
<li class="toctree-l2"><a class="reference internal" href="export-with-torch-jit-script.html">Export model with torch.jit.script()</a></li>
<li class="toctree-l2"><a class="reference internal" href="export-onnx.html">Export to ONNX</a></li>
<li class="toctree-l2 current"><a class="reference internal" href="export-ncnn.html">Export to ncnn</a><ul class="current">
<li class="toctree-l3 current"><a class="current reference internal" href="#">Export streaming Zipformer transducer models to ncnn</a><ul>
<li class="toctree-l4"><a class="reference internal" href="#download-the-pre-trained-model">1. Download the pre-trained model</a></li>
<li class="toctree-l4"><a class="reference internal" href="#install-ncnn-and-pnnx">2. Install ncnn and pnnx</a></li>
<li class="toctree-l4"><a class="reference internal" href="#export-the-model-via-torch-jit-trace">3. Export the model via torch.jit.trace()</a></li>
<li class="toctree-l4"><a class="reference internal" href="#export-torchscript-model-via-pnnx">4. Export torchscript model via pnnx</a></li>
<li class="toctree-l4"><a class="reference internal" href="#test-the-exported-models-in-icefall">5. Test the exported models in icefall</a></li>
<li class="toctree-l4"><a class="reference internal" href="#modify-the-exported-encoder-for-sherpa-ncnn">6. Modify the exported encoder for sherpa-ncnn</a></li>
</ul>
</li>
<li class="toctree-l3"><a class="reference internal" href="export-ncnn-conv-emformer.html">Export ConvEmformer transducer models to ncnn</a></li>
<li class="toctree-l3"><a class="reference internal" href="export-ncnn-lstm.html">Export LSTM transducer models to ncnn</a></li>
</ul>
</li>
</ul>
</li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../recipes/index.html">Recipes</a></li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../contributing/index.html">Contributing</a></li>
<li class="toctree-l1"><a class="reference internal" href="../huggingface/index.html">Huggingface</a></li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../decoding-with-langugage-models/index.html">Decoding with language models</a></li>
</ul>
</div>
</div>
</nav>
<section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"><nav class="wy-nav-top" aria-label="Mobile navigation menu" >
<i data-toggle="wy-nav-top" class="fa fa-bars"></i>
<a href="../index.html">icefall</a>
</nav>
<div class="wy-nav-content">
<div class="rst-content">
<div role="navigation" aria-label="Page navigation">
<ul class="wy-breadcrumbs">
<li><a href="../index.html" class="icon icon-home" aria-label="Home"></a></li>
<li class="breadcrumb-item"><a href="index.html">Model export</a></li>
<li class="breadcrumb-item"><a href="export-ncnn.html">Export to ncnn</a></li>
<li class="breadcrumb-item active">Export streaming Zipformer transducer models to ncnn</li>
<li class="wy-breadcrumbs-aside">
<a href="https://github.com/k2-fsa/icefall/blob/master/docs/source/model-export/export-ncnn-zipformer.rst" class="fa fa-github"> Edit on GitHub</a>
</li>
</ul>
<hr/>
</div>
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<section id="export-streaming-zipformer-transducer-models-to-ncnn">
<span id="id1"></span><h1>Export streaming Zipformer transducer models to ncnn<a class="headerlink" href="#export-streaming-zipformer-transducer-models-to-ncnn" title="Permalink to this heading"></a></h1>
<p>We use the pre-trained model from the following repository as an example:</p>
<p><a class="reference external" href="https://huggingface.co/Zengwei/icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29">https://huggingface.co/Zengwei/icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29</a></p>
<p>We will show you step by step how to export it to <a class="reference external" href="https://github.com/tencent/ncnn">ncnn</a> and run it with <a class="reference external" href="https://github.com/k2-fsa/sherpa-ncnn">sherpa-ncnn</a>.</p>
<div class="admonition hint">
<p class="admonition-title">Hint</p>
<p>We use <code class="docutils literal notranslate"><span class="pre">Ubuntu</span> <span class="pre">18.04</span></code>, <code class="docutils literal notranslate"><span class="pre">torch</span> <span class="pre">1.13</span></code>, and <code class="docutils literal notranslate"><span class="pre">Python</span> <span class="pre">3.8</span></code> for testing.</p>
</div>
<div class="admonition caution">
<p class="admonition-title">Caution</p>
<p>Please use a more recent version of PyTorch. For instance, <code class="docutils literal notranslate"><span class="pre">torch</span> <span class="pre">1.8</span></code>
may <code class="docutils literal notranslate"><span class="pre">not</span></code> work.</p>
</div>
<section id="download-the-pre-trained-model">
<h2>1. Download the pre-trained model<a class="headerlink" href="#download-the-pre-trained-model" title="Permalink to this heading"></a></h2>
<div class="admonition hint">
<p class="admonition-title">Hint</p>
<p>You have to install <a class="reference external" href="https://git-lfs.com/">git-lfs</a> before you continue.</p>
</div>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span><span class="nb">cd</span><span class="w"> </span>egs/librispeech/ASR
<span class="nv">GIT_LFS_SKIP_SMUDGE</span><span class="o">=</span><span class="m">1</span><span class="w"> </span>git<span class="w"> </span>clone<span class="w"> </span>https://huggingface.co/Zengwei/icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29
<span class="nb">cd</span><span class="w"> </span>icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29
git<span class="w"> </span>lfs<span class="w"> </span>pull<span class="w"> </span>--include<span class="w"> </span><span class="s2">"exp/pretrained.pt"</span>
git<span class="w"> </span>lfs<span class="w"> </span>pull<span class="w"> </span>--include<span class="w"> </span><span class="s2">"data/lang_bpe_500/bpe.model"</span>
<span class="nb">cd</span><span class="w"> </span>..
</pre></div>
</div>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>We downloaded <code class="docutils literal notranslate"><span class="pre">exp/pretrained.pt</span></code>, not <code class="docutils literal notranslate"><span class="pre">exp/cpu-jit_xxx.pt</span></code>.</p>
</div>
<p>The commands above download the pre-trained model into the directory
<code class="docutils literal notranslate"><span class="pre">egs/librispeech/ASR/icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29</span></code>.</p>
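<p>Because the clone above sets <code class="docutils literal notranslate"><span class="pre">GIT_LFS_SKIP_SMUDGE=1</span></code>, any large file that has not been fetched with <code class="docutils literal notranslate"><span class="pre">git</span> <span class="pre">lfs</span> <span class="pre">pull</span></code> remains a small text pointer. The Python sketch below is one possible way to check that the two files were really downloaded. The helper names <code class="docutils literal notranslate"><span class="pre">is_lfs_pointer</span></code> and <code class="docutils literal notranslate"><span class="pre">check_downloads</span></code> are hypothetical and not part of icefall; the sketch only assumes the standard Git LFS pointer format, whose first line starts with <code class="docutils literal notranslate"><span class="pre">version</span> <span class="pre">https://git-lfs.github.com/spec/v1</span></code>.</p>

```python
from pathlib import Path

# First line of a Git LFS pointer file (Git LFS pointer spec, version 1).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """Return True if `path` still holds an LFS pointer instead of real data."""
    with open(path, "rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def check_downloads(repo_dir: Path, relative_paths) -> list:
    """Return the relative paths that are missing or are still LFS pointers."""
    problems = []
    for rel in relative_paths:
        p = repo_dir / rel
        if not p.is_file() or is_lfs_pointer(p):
            problems.append(rel)
    return problems


if __name__ == "__main__":
    # Directory name taken from the clone command above; run from egs/librispeech/ASR.
    repo = Path(
        "icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29"
    )
    wanted = ["exp/pretrained.pt", "data/lang_bpe_500/bpe.model"]
    for rel in check_downloads(repo, wanted):
        print(f"Not downloaded yet: {rel}")
```

<p>If the script prints nothing, both files are present and contain real data rather than pointers.</p>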
</section>
<section id="install-ncnn-and-pnnx">
<h2>2. Install ncnn and pnnx<a class="headerlink" href="#install-ncnn-and-pnnx" title="Permalink to this heading"></a></h2>
<p>Please refer to <a class="reference internal" href="export-ncnn-conv-emformer.html#export-for-ncnn-install-ncnn-and-pnnx"><span class="std std-ref">2. Install ncnn and pnnx</span></a>.</p>
</section>
<section id="export-the-model-via-torch-jit-trace">
<h2>3. Export the model via torch.jit.trace()<a class="headerlink" href="#export-the-model-via-torch-jit-trace" title="Permalink to this heading"></a></h2>
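<p>First, let us create a symbolic link named <code class="docutils literal notranslate"><span class="pre">epoch-99.pt</span></code> that points to our pre-trained model:</p>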
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span><span class="nb">cd</span><span class="w"> </span>egs/librispeech/ASR

<span class="nb">cd</span><span class="w"> </span>icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp

ln<span class="w"> </span>-s<span class="w"> </span>pretrained.pt<span class="w"> </span>epoch-99.pt

<span class="nb">cd</span><span class="w"> </span>../..
</pre></div>
</div>
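<p>For readers who script this step, the same link can be created from Python. This is only a sketch: the function name <code class="docutils literal notranslate"><span class="pre">link_checkpoint</span></code> and its parameters are hypothetical, and it relies solely on <code class="docutils literal notranslate"><span class="pre">pathlib</span></code> from the standard library.</p>

```python
from pathlib import Path


def link_checkpoint(exp_dir: Path, source: str = "pretrained.pt", epoch: int = 99) -> Path:
    """Create exp_dir/epoch-<epoch>.pt as a relative symlink to `source`.

    Mirrors `ln -s pretrained.pt epoch-99.pt` run inside `exp_dir`.
    """
    if not (exp_dir / source).is_file():
        raise FileNotFoundError(exp_dir / source)
    link = exp_dir / f"epoch-{epoch}.pt"
    if link.is_symlink() or link.exists():
        return link  # leave an existing file or link untouched
    link.symlink_to(source)  # relative target, resolved inside exp_dir
    return link


if __name__ == "__main__":
    # Path taken from the commands above; run from egs/librispeech/ASR.
    exp = Path(
        "icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp"
    )
    print(link_checkpoint(exp))
```

<p>Using a relative link target keeps the link valid if the whole repository directory is moved.</p>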
<p>Next, we use the following code to export our model:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span><span class="nv">dir</span><span class="o">=</span>./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29
./pruned_transducer_stateless7_streaming/export-for-ncnn.py<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--tokens<span class="w"> </span><span class="nv">$dir</span>/data/lang_bpe_500/tokens.txt<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--exp-dir<span class="w"> </span><span class="nv">$dir</span>/exp<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--use-averaged-model<span class="w"> </span><span class="m">0</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--epoch<span class="w"> </span><span class="m">99</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--avg<span class="w"> </span><span class="m">1</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--decode-chunk-len<span class="w"> </span><span class="m">32</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--num-left-chunks<span class="w"> </span><span class="m">4</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--num-encoder-layers<span class="w"> </span><span class="s2">"2,4,3,2,4"</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--feedforward-dims<span class="w"> </span><span class="s2">"1024,1024,2048,2048,1024"</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--nhead<span class="w"> </span><span class="s2">"8,8,8,8,8"</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--encoder-dims<span class="w"> </span><span class="s2">"384,384,384,384,384"</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--attention-dims<span class="w"> </span><span class="s2">"192,192,192,192,192"</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--encoder-unmasked-dims<span class="w"> </span><span class="s2">"256,256,256,256,256"</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--zipformer-downsampling-factors<span class="w"> </span><span class="s2">"1,2,4,8,2"</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--cnn-module-kernels<span class="w"> </span><span class="s2">"31,31,31,31,31"</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--decoder-dim<span class="w"> </span><span class="m">512</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>--joiner-dim<span class="w"> </span><span class="m">512</span>
</pre></div>
</div>
<div class="admonition caution">
<p class="admonition-title">Caution</p>
<p>If your model has different configuration parameters, please change them accordingly.</p>
</div>
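<p>Each per-stack option in the export command carries five comma-separated values. When adapting the command to a different model, a quick consistency check might look like the sketch below. It assumes that every such option must supply one value per encoder stack, as the command suggests; the helper names <code class="docutils literal notranslate"><span class="pre">to_int_tuple</span></code> and <code class="docutils literal notranslate"><span class="pre">check_stack_options</span></code> are hypothetical and are not taken from the export script.</p>

```python
def to_int_tuple(s: str) -> tuple:
    """Parse a comma-separated string such as "2,4,3,2,4" into a tuple of ints."""
    return tuple(int(x) for x in s.split(","))


def check_stack_options(options: dict) -> int:
    """Check that all per-stack options have the same number of values.

    Returns the number of encoder stacks, or raises ValueError on a mismatch.
    """
    lengths = {name: len(to_int_tuple(value)) for name, value in options.items()}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"Inconsistent number of stacks: {lengths}")
    return next(iter(lengths.values()))


# Values copied from the export command above.
EXPORT_OPTIONS = {
    "--num-encoder-layers": "2,4,3,2,4",
    "--feedforward-dims": "1024,1024,2048,2048,1024",
    "--nhead": "8,8,8,8,8",
    "--encoder-dims": "384,384,384,384,384",
    "--attention-dims": "192,192,192,192,192",
    "--encoder-unmasked-dims": "256,256,256,256,256",
    "--zipformer-downsampling-factors": "1,2,4,8,2",
    "--cnn-module-kernels": "31,31,31,31,31",
}

if __name__ == "__main__":
    print(check_stack_options(EXPORT_OPTIONS))  # prints 5
```

<p>A mismatch, for example four values for one option and five for the others, raises <code class="docutils literal notranslate"><span class="pre">ValueError</span></code> before any export is attempted.</p>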
<div class="admonition hint">
<p class="admonition-title">Hint</p>
<p>We have linked our model to <code class="docutils literal notranslate"><span class="pre">epoch-99.pt</span></code> so that we can use <code class="docutils literal notranslate"><span class="pre">--epoch</span> <span class="pre">99</span></code>.
There is only one pre-trained model, so we use <code class="docutils literal notranslate"><span class="pre">--avg</span> <span class="pre">1</span> <span class="pre">--use-averaged-model</span> <span class="pre">0</span></code>.</p>
<p>If you have trained a model by yourself and if you have all checkpoints
available, please first use <code class="docutils literal notranslate"><span class="pre">decode.py</span></code> to tune <code class="docutils literal notranslate"><span class="pre">--epoch</span> <span class="pre">--avg</span></code>
and select the best combination with <code class="docutils literal notranslate"><span class="pre">--use-averaged-model</span> <span class="pre">1</span></code>.</p>
</div>
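<p>To make the relationship between <code class="docutils literal notranslate"><span class="pre">--epoch</span></code>, <code class="docutils literal notranslate"><span class="pre">--avg</span></code> and the checkpoint files concrete, here is a minimal sketch. It assumes that, with <code class="docutils literal notranslate"><span class="pre">--use-averaged-model</span> <span class="pre">0</span></code>, the last <code class="docutils literal notranslate"><span class="pre">--avg</span></code> epoch checkpoints ending at <code class="docutils literal notranslate"><span class="pre">--epoch</span></code> are averaged; the function name <code class="docutils literal notranslate"><span class="pre">checkpoints_to_average</span></code> is hypothetical and the export script itself may differ in detail.</p>

```python
def checkpoints_to_average(exp_dir: str, epoch: int, avg: int) -> list:
    """List the epoch checkpoints covered by `--epoch epoch --avg avg`.

    Assumption: plain averaging over the last `avg` epochs ending at `epoch`.
    """
    if avg < 1:
        raise ValueError("avg must be at least 1")
    start = epoch - avg + 1
    # Epoch checkpoints are numbered from 1, so skip non-positive indices.
    return [f"{exp_dir}/epoch-{i}.pt" for i in range(start, epoch + 1) if i >= 1]


if __name__ == "__main__":
    # With a single pre-trained model, only the linked file is needed.
    print(checkpoints_to_average("exp", epoch=99, avg=1))
    # A self-trained model with all checkpoints available.
    print(checkpoints_to_average("exp", epoch=30, avg=3))
```

<p>Under this assumption, <code class="docutils literal notranslate"><span class="pre">--epoch</span> <span class="pre">99</span> <span class="pre">--avg</span> <span class="pre">1</span></code> needs only <code class="docutils literal notranslate"><span class="pre">epoch-99.pt</span></code>, which is why a single link is enough for the pre-trained model.</p>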
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>You will see the following log output:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">23</span><span class="p">:</span><span class="mi">07</span><span class="p">,</span><span class="mi">473</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">export</span><span class="o">-</span><span class="k">for</span><span class="o">-</span><span class="n">ncnn</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">246</span><span class="p">]</span> <span class="n">device</span><span class="p">:</span> <span class="n">cpu</span>
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">23</span><span class="p">:</span><span class="mi">07</span><span class="p">,</span><span class="mi">477</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">export</span><span class="o">-</span><span class="k">for</span><span class="o">-</span><span class="n">ncnn</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">255</span><span class="p">]</span> <span class="p">{</span><span class="s1">'best_train_loss'</span><span class="p">:</span> <span class="n">inf</span><span class="p">,</span> <span class="s1">'best_valid_loss'</span><span class="p">:</span> <span class="n">inf</span><span class="p">,</span> <span class="s1">'best_train_epoch'</span><span class="p">:</span> <span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="s1">'best_valid_epoch'</span><span class="p">:</span> <span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="s1">'batch_idx_train'</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span> <span class="s1">'log_interval'</span><span class="p">:</span> <span class="mi">50</span><span class="p">,</span> <span class="s1">'reset_interval'</span><span class="p">:</span> <span class="mi">200</span><span class="p">,</span> <span class="s1">'valid_interval'</span><span class="p">:</span> <span class="mi">3000</span><span class="p">,</span> <span class="s1">'feature_dim'</span><span class="p">:</span> <span class="mi">80</span><span class="p">,</span> <span class="s1">'subsampling_factor'</span><span class="p">:</span> <span class="mi">4</span><span class="p">,</span> <span class="s1">'warm_step'</span><span class="p">:</span> <span class="mi">2000</span><span class="p">,</span> <span 
class="s1">'env_info'</span><span class="p">:</span> <span class="p">{</span><span class="s1">'k2-version'</span><span class="p">:</span> <span class="s1">'1.23.4'</span><span class="p">,</span> <span class="s1">'k2-build-type'</span><span class="p">:</span> <span class="s1">'Release'</span><span class="p">,</span> <span class="s1">'k2-with-cuda'</span><span class="p">:</span> <span class="kc">True</span><span class="p">,</span> <span class="s1">'k2-git-sha1'</span><span class="p">:</span> <span class="s1">'62e404dd3f3a811d73e424199b3408e309c06e1a'</span><span class="p">,</span> <span class="s1">'k2-git-date'</span><span class="p">:</span> <span class="s1">'Mon Jan 30 10:26:16 2023'</span><span class="p">,</span> <span class="s1">'lhotse-version'</span><span class="p">:</span> <span class="s1">'1.12.0.dev+missing.version.file'</span><span class="p">,</span> <span class="s1">'torch-version'</span><span class="p">:</span> <span class="s1">'1.10.0+cu102'</span><span class="p">,</span> <span class="s1">'torch-cuda-available'</span><span class="p">:</span> <span class="kc">True</span><span class="p">,</span> <span class="s1">'torch-cuda-version'</span><span class="p">:</span> <span class="s1">'10.2'</span><span class="p">,</span> <span class="s1">'python-version'</span><span class="p">:</span> <span class="s1">'3.8'</span><span class="p">,</span> <span class="s1">'icefall-git-branch'</span><span class="p">:</span> <span class="s1">'master'</span><span class="p">,</span> <span class="s1">'icefall-git-sha1'</span><span class="p">:</span> <span class="s1">'6d7a559-clean'</span><span class="p">,</span> <span class="s1">'icefall-git-date'</span><span class="p">:</span> <span class="s1">'Thu Feb 16 19:47:54 2023'</span><span class="p">,</span> <span class="s1">'icefall-path'</span><span class="p">:</span> <span class="s1">'/star-fj/fangjun/open-source/icefall-2'</span><span class="p">,</span> <span class="s1">'k2-path'</span><span class="p">:</span> <span 
class="s1">'/star-fj/fangjun/open-source/k2/k2/python/k2/__init__.py'</span><span class="p">,</span> <span class="s1">'lhotse-path'</span><span class="p">:</span> <span class="s1">'/star-fj/fangjun/open-source/lhotse/lhotse/__init__.py'</span><span class="p">,</span> <span class="s1">'hostname'</span><span class="p">:</span> <span class="s1">'de-74279-k2-train-3-1220120619-7695ff496b-s9n4w'</span><span class="p">,</span> <span class="s1">'IP address'</span><span class="p">:</span> <span class="s1">'10.177.6.147'</span><span class="p">},</span> <span class="s1">'epoch'</span><span class="p">:</span> <span class="mi">99</span><span class="p">,</span> <span class="s1">'iter'</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span> <span class="s1">'avg'</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="s1">'exp_dir'</span><span class="p">:</span> <span class="n">PosixPath</span><span class="p">(</span><span class="s1">'icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp'</span><span class="p">),</span> <span class="s1">'bpe_model'</span><span class="p">:</span> <span class="s1">'./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/data/lang_bpe_500/bpe.model'</span><span class="p">,</span> <span class="s1">'context_size'</span><span class="p">:</span> <span class="mi">2</span><span class="p">,</span> <span class="s1">'use_averaged_model'</span><span class="p">:</span> <span class="kc">False</span><span class="p">,</span> <span class="s1">'num_encoder_layers'</span><span class="p">:</span> <span class="s1">'2,4,3,2,4'</span><span class="p">,</span> <span class="s1">'feedforward_dims'</span><span class="p">:</span> <span class="s1">'1024,1024,2048,2048,1024'</span><span class="p">,</span> <span class="s1">'nhead'</span><span class="p">:</span> <span class="s1">'8,8,8,8,8'</span><span class="p">,</span> <span class="s1">'encoder_dims'</span><span 
class="p">:</span> <span class="s1">'384,384,384,384,384'</span><span class="p">,</span> <span class="s1">'attention_dims'</span><span class="p">:</span> <span class="s1">'192,192,192,192,192'</span><span class="p">,</span> <span class="s1">'encoder_unmasked_dims'</span><span class="p">:</span> <span class="s1">'256,256,256,256,256'</span><span class="p">,</span> <span class="s1">'zipformer_downsampling_factors'</span><span class="p">:</span> <span class="s1">'1,2,4,8,2'</span><span class="p">,</span> <span class="s1">'cnn_module_kernels'</span><span class="p">:</span> <span class="s1">'31,31,31,31,31'</span><span class="p">,</span> <span class="s1">'decoder_dim'</span><span class="p">:</span> <span class="mi">512</span><span class="p">,</span> <span class="s1">'joiner_dim'</span><span class="p">:</span> <span class="mi">512</span><span class="p">,</span> <span class="s1">'short_chunk_size'</span><span class="p">:</span> <span class="mi">50</span><span class="p">,</span> <span class="s1">'num_left_chunks'</span><span class="p">:</span> <span class="mi">4</span><span class="p">,</span> <span class="s1">'decode_chunk_len'</span><span class="p">:</span> <span class="mi">32</span><span class="p">,</span> <span class="s1">'blank_id'</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span> <span class="s1">'vocab_size'</span><span class="p">:</span> <span class="mi">500</span><span class="p">}</span>
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">23</span><span class="p">:</span><span class="mi">07</span><span class="p">,</span><span class="mi">477</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">export</span><span class="o">-</span><span class="k">for</span><span class="o">-</span><span class="n">ncnn</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">257</span><span class="p">]</span> <span class="n">About</span> <span class="n">to</span> <span class="n">create</span> <span class="n">model</span>
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">23</span><span class="p">:</span><span class="mi">08</span><span class="p">,</span><span class="mi">023</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">419</span><span class="p">]</span> <span class="n">At</span> <span class="n">encoder</span> <span class="n">stack</span> <span class="mi">4</span><span class="p">,</span> <span class="n">which</span> <span class="n">has</span> <span class="n">downsampling_factor</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">we</span> <span class="n">will</span> <span class="n">combine</span> <span class="n">the</span> <span class="n">outputs</span> <span class="n">of</span> <span class="n">layers</span> <span class="mi">1</span> <span class="ow">and</span> <span class="mi">3</span><span class="p">,</span> <span class="k">with</span> <span class="n">downsampling_factors</span><span class="o">=</span><span class="mi">2</span> <span class="ow">and</span> <span class="mf">8.</span>
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">23</span><span class="p">:</span><span class="mi">08</span><span class="p">,</span><span class="mi">037</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">checkpoint</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">112</span><span class="p">]</span> <span class="n">Loading</span> <span class="n">checkpoint</span> <span class="kn">from</span> <span class="nn">icefall</span><span class="o">-</span><span class="n">asr</span><span class="o">-</span><span class="n">librispeech</span><span class="o">-</span><span class="n">pruned</span><span class="o">-</span><span class="n">transducer</span><span class="o">-</span><span class="n">stateless7</span><span class="o">-</span><span class="n">streaming</span><span class="o">-</span><span class="mi">2022</span><span class="o">-</span><span class="mi">12</span><span class="o">-</span><span class="mi">29</span><span class="o">/</span><span class="n">exp</span><span class="o">/</span><span class="n">epoch</span><span class="o">-</span><span class="mf">99.</span><span class="n">pt</span>
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">23</span><span class="p">:</span><span class="mi">08</span><span class="p">,</span><span class="mi">655</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">export</span><span class="o">-</span><span class="k">for</span><span class="o">-</span><span class="n">ncnn</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">346</span><span class="p">]</span> <span class="n">encoder</span> <span class="n">parameters</span><span class="p">:</span> <span class="mi">68944004</span>
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">23</span><span class="p">:</span><span class="mi">08</span><span class="p">,</span><span class="mi">655</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">export</span><span class="o">-</span><span class="k">for</span><span class="o">-</span><span class="n">ncnn</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">347</span><span class="p">]</span> <span class="n">decoder</span> <span class="n">parameters</span><span class="p">:</span> <span class="mi">260096</span>
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">23</span><span class="p">:</span><span class="mi">08</span><span class="p">,</span><span class="mi">655</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">export</span><span class="o">-</span><span class="k">for</span><span class="o">-</span><span class="n">ncnn</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">348</span><span class="p">]</span> <span class="n">joiner</span> <span class="n">parameters</span><span class="p">:</span> <span class="mi">716276</span>
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">23</span><span class="p">:</span><span class="mi">08</span><span class="p">,</span><span class="mi">656</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">export</span><span class="o">-</span><span class="k">for</span><span class="o">-</span><span class="n">ncnn</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">349</span><span class="p">]</span> <span class="n">total</span> <span class="n">parameters</span><span class="p">:</span> <span class="mi">69920376</span>
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">23</span><span class="p">:</span><span class="mi">08</span><span class="p">,</span><span class="mi">656</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">export</span><span class="o">-</span><span class="k">for</span><span class="o">-</span><span class="n">ncnn</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">351</span><span class="p">]</span> <span class="n">Using</span> <span class="n">torch</span><span class="o">.</span><span class="n">jit</span><span class="o">.</span><span class="n">trace</span><span class="p">()</span>
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">23</span><span class="p">:</span><span class="mi">08</span><span class="p">,</span><span class="mi">656</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">export</span><span class="o">-</span><span class="k">for</span><span class="o">-</span><span class="n">ncnn</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">353</span><span class="p">]</span> <span class="n">Exporting</span> <span class="n">encoder</span>
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">23</span><span class="p">:</span><span class="mi">08</span><span class="p">,</span><span class="mi">656</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">export</span><span class="o">-</span><span class="k">for</span><span class="o">-</span><span class="n">ncnn</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">174</span><span class="p">]</span> <span class="n">decode_chunk_len</span><span class="p">:</span> <span class="mi">32</span>
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">23</span><span class="p">:</span><span class="mi">08</span><span class="p">,</span><span class="mi">656</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">export</span><span class="o">-</span><span class="k">for</span><span class="o">-</span><span class="n">ncnn</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">175</span><span class="p">]</span> <span class="n">T</span><span class="p">:</span> <span class="mi">39</span>
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">1344</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
<span class="k">assert</span> <span class="n">cached_len</span><span class="o">.</span><span class="n">size</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="o">==</span> <span class="bp">self</span><span class="o">.</span><span class="n">num_layers</span><span class="p">,</span> <span class="p">(</span>
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">1348</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
<span class="k">assert</span> <span class="n">cached_avg</span><span class="o">.</span><span class="n">size</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="o">==</span> <span class="bp">self</span><span class="o">.</span><span class="n">num_layers</span><span class="p">,</span> <span class="p">(</span>
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">1352</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
<span class="k">assert</span> <span class="n">cached_key</span><span class="o">.</span><span class="n">size</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="o">==</span> <span class="bp">self</span><span class="o">.</span><span class="n">num_layers</span><span class="p">,</span> <span class="p">(</span>
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">1356</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
<span class="k">assert</span> <span class="n">cached_val</span><span class="o">.</span><span class="n">size</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="o">==</span> <span class="bp">self</span><span class="o">.</span><span class="n">num_layers</span><span class="p">,</span> <span class="p">(</span>
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">1360</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
<span class="k">assert</span> <span class="n">cached_val2</span><span class="o">.</span><span class="n">size</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="o">==</span> <span class="bp">self</span><span class="o">.</span><span class="n">num_layers</span><span class="p">,</span> <span class="p">(</span>
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">1364</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
<span class="k">assert</span> <span class="n">cached_conv1</span><span class="o">.</span><span class="n">size</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="o">==</span> <span class="bp">self</span><span class="o">.</span><span class="n">num_layers</span><span class="p">,</span> <span class="p">(</span>
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">1368</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
<span class="k">assert</span> <span class="n">cached_conv2</span><span class="o">.</span><span class="n">size</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="o">==</span> <span class="bp">self</span><span class="o">.</span><span class="n">num_layers</span><span class="p">,</span> <span class="p">(</span>
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">1373</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
<span class="k">assert</span> <span class="bp">self</span><span class="o">.</span><span class="n">left_context_len</span> <span class="o">==</span> <span class="n">cached_key</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="p">(</span>
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">1884</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
<span class="k">assert</span> <span class="bp">self</span><span class="o">.</span><span class="n">x_size</span> <span class="o">==</span> <span class="n">x</span><span class="o">.</span><span class="n">size</span><span class="p">(</span><span class="mi">0</span><span class="p">),</span> <span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">x_size</span><span class="p">,</span> <span class="n">x</span><span class="o">.</span><span class="n">size</span><span class="p">(</span><span class="mi">0</span><span class="p">))</span>
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">2442</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
<span class="k">assert</span> <span class="n">cached_key</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="bp">self</span><span class="o">.</span><span class="n">left_context_len</span><span class="p">,</span> <span class="p">(</span>
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">2449</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
<span class="k">assert</span> <span class="n">cached_key</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="n">cached_val</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="p">(</span>
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">2469</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
<span class="k">assert</span> <span class="n">cached_key</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="n">left_context_len</span><span class="p">,</span> <span class="p">(</span>
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">2473</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
<span class="k">assert</span> <span class="n">cached_val</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="n">left_context_len</span><span class="p">,</span> <span class="p">(</span>
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">2483</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
<span class="k">assert</span> <span class="n">kv_len</span> <span class="o">==</span> <span class="n">k</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="p">(</span><span class="n">kv_len</span><span class="p">,</span> <span class="n">k</span><span class="o">.</span><span class="n">shape</span><span class="p">)</span>
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">2570</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
<span class="k">assert</span> <span class="nb">list</span><span class="p">(</span><span class="n">attn_output</span><span class="o">.</span><span class="n">size</span><span class="p">())</span> <span class="o">==</span> <span class="p">[</span><span class="n">bsz</span> <span class="o">*</span> <span class="n">num_heads</span><span class="p">,</span> <span class="n">seq_len</span><span class="p">,</span> <span class="n">head_dim</span> <span class="o">//</span> <span class="mi">2</span><span class="p">]</span>
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">2926</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
|
||
<span class="k">assert</span> <span class="n">cache</span><span class="o">.</span><span class="n">shape</span> <span class="o">==</span> <span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">size</span><span class="p">(</span><span class="mi">0</span><span class="p">),</span> <span class="n">x</span><span class="o">.</span><span class="n">size</span><span class="p">(</span><span class="mi">1</span><span class="p">),</span> <span class="bp">self</span><span class="o">.</span><span class="n">lorder</span><span class="p">),</span> <span class="p">(</span>
|
||
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">2652</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
|
||
<span class="k">assert</span> <span class="n">x</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="bp">self</span><span class="o">.</span><span class="n">x_size</span><span class="p">,</span> <span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="bp">self</span><span class="o">.</span><span class="n">x_size</span><span class="p">)</span>
|
||
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">2653</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
|
||
<span class="k">assert</span> <span class="n">x</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">==</span> <span class="bp">self</span><span class="o">.</span><span class="n">embed_dim</span><span class="p">,</span> <span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">2</span><span class="p">],</span> <span class="bp">self</span><span class="o">.</span><span class="n">embed_dim</span><span class="p">)</span>
|
||
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">2666</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
|
||
<span class="k">assert</span> <span class="n">cached_val</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="bp">self</span><span class="o">.</span><span class="n">left_context_len</span><span class="p">,</span> <span class="p">(</span>
|
||
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">1543</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
|
||
<span class="k">assert</span> <span class="n">src</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="bp">self</span><span class="o">.</span><span class="n">in_x_size</span><span class="p">,</span> <span class="p">(</span><span class="n">src</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="bp">self</span><span class="o">.</span><span class="n">in_x_size</span><span class="p">)</span>
|
||
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">1637</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
|
||
<span class="k">assert</span> <span class="n">src</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="bp">self</span><span class="o">.</span><span class="n">in_x_size</span><span class="p">,</span> <span class="p">(</span>
|
||
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">1643</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
|
||
<span class="k">assert</span> <span class="n">src</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">==</span> <span class="bp">self</span><span class="o">.</span><span class="n">in_channels</span><span class="p">,</span> <span class="p">(</span><span class="n">src</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">2</span><span class="p">],</span> <span class="bp">self</span><span class="o">.</span><span class="n">in_channels</span><span class="p">)</span>
|
||
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">1571</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
|
||
<span class="k">if</span> <span class="n">src</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">!=</span> <span class="bp">self</span><span class="o">.</span><span class="n">in_x_size</span><span class="p">:</span>
|
||
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">1763</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
|
||
<span class="k">assert</span> <span class="n">src1</span><span class="o">.</span><span class="n">shape</span><span class="p">[:</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="o">==</span> <span class="n">src2</span><span class="o">.</span><span class="n">shape</span><span class="p">[:</span><span class="o">-</span><span class="mi">1</span><span class="p">],</span> <span class="p">(</span><span class="n">src1</span><span class="o">.</span><span class="n">shape</span><span class="p">,</span> <span class="n">src2</span><span class="o">.</span><span class="n">shape</span><span class="p">)</span>
|
||
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">1779</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
|
||
<span class="k">assert</span> <span class="n">src1</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="o">==</span> <span class="bp">self</span><span class="o">.</span><span class="n">dim1</span><span class="p">,</span> <span class="p">(</span><span class="n">src1</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">],</span> <span class="bp">self</span><span class="o">.</span><span class="n">dim1</span><span class="p">)</span>
|
||
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">zipformer2</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">1780</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
|
||
<span class="k">assert</span> <span class="n">src2</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="o">==</span> <span class="bp">self</span><span class="o">.</span><span class="n">dim2</span><span class="p">,</span> <span class="p">(</span><span class="n">src2</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">],</span> <span class="bp">self</span><span class="o">.</span><span class="n">dim2</span><span class="p">)</span>
|
||
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="n">py38</span><span class="o">/</span><span class="n">lib</span><span class="o">/</span><span class="n">python3</span><span class="mf">.8</span><span class="o">/</span><span class="n">site</span><span class="o">-</span><span class="n">packages</span><span class="o">/</span><span class="n">torch</span><span class="o">/</span><span class="n">jit</span><span class="o">/</span><span class="n">_trace</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">958</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Encountering</span> <span class="n">a</span> <span class="nb">list</span> <span class="n">at</span> <span class="n">the</span> <span class="n">output</span> <span class="n">of</span> <span class="n">the</span> <span class="n">tracer</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="p">,</span> <span class="n">this</span> <span class="ow">is</span> <span class="n">only</span> <span class="n">valid</span> <span class="k">if</span> <span class="n">the</span> <span class="n">container</span> <span class="n">structure</span> <span class="n">does</span> <span class="ow">not</span> <span class="n">change</span> <span class="n">based</span> <span class="n">on</span> <span class="n">the</span> <span class="n">module</span><span class="s1">'s inputs. Consider using a constant container instead (e.g. for `list`, use a `tuple` instead. for `dict`, use a `NamedTuple` instead). If you absolutely need this and know the side effects, pass strict=False to trace() to allow this behavior.</span>
|
||
<span class="n">module</span><span class="o">.</span><span class="n">_c</span><span class="o">.</span><span class="n">_create_method_from_trace</span><span class="p">(</span>
|
||
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">23</span><span class="p">:</span><span class="mi">19</span><span class="p">,</span><span class="mi">640</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">export</span><span class="o">-</span><span class="k">for</span><span class="o">-</span><span class="n">ncnn</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">182</span><span class="p">]</span> <span class="n">Saved</span> <span class="n">to</span> <span class="n">icefall</span><span class="o">-</span><span class="n">asr</span><span class="o">-</span><span class="n">librispeech</span><span class="o">-</span><span class="n">pruned</span><span class="o">-</span><span class="n">transducer</span><span class="o">-</span><span class="n">stateless7</span><span class="o">-</span><span class="n">streaming</span><span class="o">-</span><span class="mi">2022</span><span class="o">-</span><span class="mi">12</span><span class="o">-</span><span class="mi">29</span><span class="o">/</span><span class="n">exp</span><span class="o">/</span><span class="n">encoder_jit_trace</span><span class="o">-</span><span class="n">pnnx</span><span class="o">.</span><span class="n">pt</span>
|
||
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">23</span><span class="p">:</span><span class="mi">19</span><span class="p">,</span><span class="mi">646</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">export</span><span class="o">-</span><span class="k">for</span><span class="o">-</span><span class="n">ncnn</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">357</span><span class="p">]</span> <span class="n">Exporting</span> <span class="n">decoder</span>
|
||
<span class="o">/</span><span class="n">star</span><span class="o">-</span><span class="n">fj</span><span class="o">/</span><span class="n">fangjun</span><span class="o">/</span><span class="nb">open</span><span class="o">-</span><span class="n">source</span><span class="o">/</span><span class="n">icefall</span><span class="o">-</span><span class="mi">2</span><span class="o">/</span><span class="n">egs</span><span class="o">/</span><span class="n">librispeech</span><span class="o">/</span><span class="n">ASR</span><span class="o">/</span><span class="n">pruned_transducer_stateless7_streaming</span><span class="o">/</span><span class="n">decoder</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">102</span><span class="p">:</span> <span class="n">TracerWarning</span><span class="p">:</span> <span class="n">Converting</span> <span class="n">a</span> <span class="n">tensor</span> <span class="n">to</span> <span class="n">a</span> <span class="n">Python</span> <span class="n">boolean</span> <span class="n">might</span> <span class="n">cause</span> <span class="n">the</span> <span class="n">trace</span> <span class="n">to</span> <span class="n">be</span> <span class="n">incorrect</span><span class="o">.</span> <span class="n">We</span> <span class="n">can</span><span class="s1">'t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!</span>
|
||
<span class="k">assert</span> <span class="n">embedding_out</span><span class="o">.</span><span class="n">size</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="o">==</span> <span class="bp">self</span><span class="o">.</span><span class="n">context_size</span>
|
||
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">23</span><span class="p">:</span><span class="mi">19</span><span class="p">,</span><span class="mi">686</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">export</span><span class="o">-</span><span class="k">for</span><span class="o">-</span><span class="n">ncnn</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">204</span><span class="p">]</span> <span class="n">Saved</span> <span class="n">to</span> <span class="n">icefall</span><span class="o">-</span><span class="n">asr</span><span class="o">-</span><span class="n">librispeech</span><span class="o">-</span><span class="n">pruned</span><span class="o">-</span><span class="n">transducer</span><span class="o">-</span><span class="n">stateless7</span><span class="o">-</span><span class="n">streaming</span><span class="o">-</span><span class="mi">2022</span><span class="o">-</span><span class="mi">12</span><span class="o">-</span><span class="mi">29</span><span class="o">/</span><span class="n">exp</span><span class="o">/</span><span class="n">decoder_jit_trace</span><span class="o">-</span><span class="n">pnnx</span><span class="o">.</span><span class="n">pt</span>
|
||
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">23</span><span class="p">:</span><span class="mi">19</span><span class="p">,</span><span class="mi">686</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">export</span><span class="o">-</span><span class="k">for</span><span class="o">-</span><span class="n">ncnn</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">361</span><span class="p">]</span> <span class="n">Exporting</span> <span class="n">joiner</span>
|
||
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">23</span><span class="p">:</span><span class="mi">19</span><span class="p">,</span><span class="mi">735</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">export</span><span class="o">-</span><span class="k">for</span><span class="o">-</span><span class="n">ncnn</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">231</span><span class="p">]</span> <span class="n">Saved</span> <span class="n">to</span> <span class="n">icefall</span><span class="o">-</span><span class="n">asr</span><span class="o">-</span><span class="n">librispeech</span><span class="o">-</span><span class="n">pruned</span><span class="o">-</span><span class="n">transducer</span><span class="o">-</span><span class="n">stateless7</span><span class="o">-</span><span class="n">streaming</span><span class="o">-</span><span class="mi">2022</span><span class="o">-</span><span class="mi">12</span><span class="o">-</span><span class="mi">29</span><span class="o">/</span><span class="n">exp</span><span class="o">/</span><span class="n">joiner_jit_trace</span><span class="o">-</span><span class="n">pnnx</span><span class="o">.</span><span class="n">pt</span>
|
||
</pre></div>
|
||
</div>
|
||
<p>The log shows the model has <code class="docutils literal notranslate"><span class="pre">69920376</span></code> parameters, i.e., <code class="docutils literal notranslate"><span class="pre">~69.9</span> <span class="pre">M</span></code>.</p>
|
||
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>ls<span class="w"> </span>-lh<span class="w"> </span>icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/pretrained.pt
|
||
-rw-r--r--<span class="w"> </span><span class="m">1</span><span class="w"> </span>kuangfangjun<span class="w"> </span>root<span class="w"> </span>269M<span class="w"> </span>Jan<span class="w"> </span><span class="m">12</span><span class="w"> </span><span class="m">12</span>:53<span class="w"> </span>icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/pretrained.pt
|
||
</pre></div>
|
||
</div>
|
||
<p>The pre-trained model file is <code class="docutils literal notranslate"><span class="pre">269</span> <span class="pre">MB</span></code>, which is close to the size
expected when each of the <code class="docutils literal notranslate"><span class="pre">69920376</span></code> parameters is stored as a 4-byte float:
<code class="docutils literal notranslate"><span class="pre">69920376*4/1024/1024</span> <span class="pre">=</span> <span class="pre">266.725</span> <span class="pre">MB</span></code>.</p>
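<p>The estimate above can be reproduced with a few lines of Python. This is a minimal sketch, not part of icefall: the helper name <code class="docutils literal notranslate"><span class="pre">expected_size_mb</span></code> is hypothetical, and the 4 bytes per parameter is an assumption (32-bit floats) taken from the <code class="docutils literal notranslate"><span class="pre">*4</span></code> in the formula.</p>

```python
def expected_size_mb(num_params: int, bytes_per_param: int = 4) -> float:
    """Estimate the on-disk size of the weights in MB (1 MB = 1024 * 1024 bytes).

    bytes_per_param=4 is an assumption: one 32-bit float per parameter.
    """
    return num_params * bytes_per_param / 1024 / 1024


# Parameter count reported in the export log.
num_params = 69920376
size_mb = expected_size_mb(num_params)
print(f"{size_mb:.3f} MB")
```

<p>The checkpoint on disk is slightly larger than this estimate, since the file holds more than the raw parameter values.</p>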
|
||
</div>
|
||
<p>After running <code class="docutils literal notranslate"><span class="pre">pruned_transducer_stateless7_streaming/export-for-ncnn.py</span></code>,
|
||
we will get the following files:</p>
|
||
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>ls<span class="w"> </span>-lh<span class="w"> </span>icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/*pnnx.pt
|
||
|
||
-rw-r--r--<span class="w"> </span><span class="m">1</span><span class="w"> </span>kuangfangjun<span class="w"> </span>root<span class="w"> </span>1022K<span class="w"> </span>Feb<span class="w"> </span><span class="m">27</span><span class="w"> </span><span class="m">20</span>:23<span class="w"> </span>icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/decoder_jit_trace-pnnx.pt
|
||
-rw-r--r--<span class="w"> </span><span class="m">1</span><span class="w"> </span>kuangfangjun<span class="w"> </span>root<span class="w"> </span>266M<span class="w"> </span>Feb<span class="w"> </span><span class="m">27</span><span class="w"> </span><span class="m">20</span>:23<span class="w"> </span>icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/encoder_jit_trace-pnnx.pt
|
||
-rw-r--r--<span class="w"> </span><span class="m">1</span><span class="w"> </span>kuangfangjun<span class="w"> </span>root<span class="w"> </span><span class="m">2</span>.8M<span class="w"> </span>Feb<span class="w"> </span><span class="m">27</span><span class="w"> </span><span class="m">20</span>:23<span class="w"> </span>icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/joiner_jit_trace-pnnx.pt
|
||
</pre></div>
|
||
</div>
|
||
</section>
|
||
<section id="export-torchscript-model-via-pnnx">
|
||
<span id="zipformer-transducer-step-4-export-torchscript-model-via-pnnx"></span><h2>4. Export torchscript model via pnnx<a class="headerlink" href="#export-torchscript-model-via-pnnx" title="Permalink to this heading"></a></h2>
|
||
<div class="admonition hint">
|
||
<p class="admonition-title">Hint</p>
|
||
<p>Make sure you have set up the <code class="docutils literal notranslate"><span class="pre">PATH</span></code> environment variable
as described in <a class="reference internal" href="export-ncnn-conv-emformer.html#export-for-ncnn-install-ncnn-and-pnnx"><span class="std std-ref">2. Install ncnn and pnnx</span></a>. Otherwise,
the commands below will fail with an error saying that <code class="docutils literal notranslate"><span class="pre">pnnx</span></code> could not be found.</p>
|
||
</div>
|
||
<p>Now, it’s time to export our models to <a class="reference external" href="https://github.com/tencent/ncnn">ncnn</a> via <code class="docutils literal notranslate"><span class="pre">pnnx</span></code>.</p>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">cd</span> <span class="n">icefall</span><span class="o">-</span><span class="n">asr</span><span class="o">-</span><span class="n">librispeech</span><span class="o">-</span><span class="n">pruned</span><span class="o">-</span><span class="n">transducer</span><span class="o">-</span><span class="n">stateless7</span><span class="o">-</span><span class="n">streaming</span><span class="o">-</span><span class="mi">2022</span><span class="o">-</span><span class="mi">12</span><span class="o">-</span><span class="mi">29</span><span class="o">/</span><span class="n">exp</span><span class="o">/</span>
|
||
|
||
<span class="n">pnnx</span> <span class="o">./</span><span class="n">encoder_jit_trace</span><span class="o">-</span><span class="n">pnnx</span><span class="o">.</span><span class="n">pt</span>
|
||
<span class="n">pnnx</span> <span class="o">./</span><span class="n">decoder_jit_trace</span><span class="o">-</span><span class="n">pnnx</span><span class="o">.</span><span class="n">pt</span>
|
||
<span class="n">pnnx</span> <span class="o">./</span><span class="n">joiner_jit_trace</span><span class="o">-</span><span class="n">pnnx</span><span class="o">.</span><span class="n">pt</span>
|
||
</pre></div>
|
||
</div>
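<p>If you drive the conversion from a script, it may help to check that <code class="docutils literal notranslate"><span class="pre">pnnx</span></code> is reachable before invoking it. The following is a minimal sketch using the Python standard library; the helper name <code class="docutils literal notranslate"><span class="pre">find_executable</span></code> is hypothetical and not part of icefall or pnnx.</p>

```python
import shutil


def find_executable(name: str) -> str:
    """Return the full path of an executable, or raise if it is not on PATH."""
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(
            f"'{name}' was not found on PATH; please revisit the installation step"
        )
    return path


try:
    print("Found pnnx at", find_executable("pnnx"))
except FileNotFoundError as e:
    # Report the problem instead of failing later inside the conversion step.
    print(e)
```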
|
||
<p>It will generate the following files:</p>
|
||
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>ls<span class="w"> </span>-lh<span class="w"> </span>icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/*ncnn*<span class="o">{</span>bin,param<span class="o">}</span>
|
||
|
||
-rw-r--r--<span class="w"> </span><span class="m">1</span><span class="w"> </span>kuangfangjun<span class="w"> </span>root<span class="w"> </span>509K<span class="w"> </span>Feb<span class="w"> </span><span class="m">27</span><span class="w"> </span><span class="m">20</span>:31<span class="w"> </span>icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/decoder_jit_trace-pnnx.ncnn.bin
|
||
-rw-r--r--<span class="w"> </span><span class="m">1</span><span class="w"> </span>kuangfangjun<span class="w"> </span>root<span class="w"> </span><span class="m">437</span><span class="w"> </span>Feb<span class="w"> </span><span class="m">27</span><span class="w"> </span><span class="m">20</span>:31<span class="w"> </span>icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/decoder_jit_trace-pnnx.ncnn.param
|
||
-rw-r--r--<span class="w"> </span><span class="m">1</span><span class="w"> </span>kuangfangjun<span class="w"> </span>root<span class="w"> </span>133M<span class="w"> </span>Feb<span class="w"> </span><span class="m">27</span><span class="w"> </span><span class="m">20</span>:30<span class="w"> </span>icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/encoder_jit_trace-pnnx.ncnn.bin
|
||
-rw-r--r--<span class="w"> </span><span class="m">1</span><span class="w"> </span>kuangfangjun<span class="w"> </span>root<span class="w"> </span>152K<span class="w"> </span>Feb<span class="w"> </span><span class="m">27</span><span class="w"> </span><span class="m">20</span>:30<span class="w"> </span>icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/encoder_jit_trace-pnnx.ncnn.param
|
||
-rw-r--r--<span class="w"> </span><span class="m">1</span><span class="w"> </span>kuangfangjun<span class="w"> </span>root<span class="w"> </span><span class="m">1</span>.4M<span class="w"> </span>Feb<span class="w"> </span><span class="m">27</span><span class="w"> </span><span class="m">20</span>:31<span class="w"> </span>icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/joiner_jit_trace-pnnx.ncnn.bin
|
||
-rw-r--r--<span class="w"> </span><span class="m">1</span><span class="w"> </span>kuangfangjun<span class="w"> </span>root<span class="w"> </span><span class="m">488</span><span class="w"> </span>Feb<span class="w"> </span><span class="m">27</span><span class="w"> </span><span class="m">20</span>:31<span class="w"> </span>icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/joiner_jit_trace-pnnx.ncnn.param
|
||
</pre></div>
|
||
</div>
|
||
<p>There are two types of files:</p>
|
||
<ul class="simple">
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">param</span></code>: It is a text file containing the model architectures. You can
|
||
use a text editor to view its content.</p></li>
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">bin</span></code>: It is a binary file containing the model parameters.</p></li>
|
||
</ul>
|
||
<p>We compare the file sizes of the models below before and after converting via <code class="docutils literal notranslate"><span class="pre">pnnx</span></code>:</p>
|
||
<table class="docutils align-default">
|
||
<thead>
|
||
<tr class="row-odd"><th class="head"><p>File name</p></th>
|
||
<th class="head"><p>File size</p></th>
|
||
</tr>
|
||
</thead>
|
||
<tbody>
|
||
<tr class="row-even"><td><p>encoder_jit_trace-pnnx.pt</p></td>
|
||
<td><p>266 MB</p></td>
|
||
</tr>
|
||
<tr class="row-odd"><td><p>decoder_jit_trace-pnnx.pt</p></td>
|
||
<td><p>1022 KB</p></td>
|
||
</tr>
|
||
<tr class="row-even"><td><p>joiner_jit_trace-pnnx.pt</p></td>
|
||
<td><p>2.8 MB</p></td>
|
||
</tr>
|
||
<tr class="row-odd"><td><p>encoder_jit_trace-pnnx.ncnn.bin</p></td>
|
||
<td><p>133 MB</p></td>
|
||
</tr>
|
||
<tr class="row-even"><td><p>decoder_jit_trace-pnnx.ncnn.bin</p></td>
|
||
<td><p>509 KB</p></td>
|
||
</tr>
|
||
<tr class="row-odd"><td><p>joiner_jit_trace-pnnx.ncnn.bin</p></td>
|
||
<td><p>1.4 MB</p></td>
|
||
</tr>
|
||
</tbody>
|
||
</table>
|
||
<p>You can see that the file sizes of the models after conversion are about one half
|
||
of the models before conversion:</p>
|
||
<blockquote>
|
||
<div><ul class="simple">
|
||
<li><p>encoder: 266 MB vs 133 MB</p></li>
|
||
<li><p>decoder: 1022 KB vs 509 KB</p></li>
|
||
<li><p>joiner: 2.8 MB vs 1.4 MB</p></li>
|
||
</ul>
|
||
</div></blockquote>
|
||
<p>The reason is that by default <code class="docutils literal notranslate"><span class="pre">pnnx</span></code> converts <code class="docutils literal notranslate"><span class="pre">float32</span></code> parameters
|
||
to <code class="docutils literal notranslate"><span class="pre">float16</span></code>. A <code class="docutils literal notranslate"><span class="pre">float32</span></code> parameter occupies 4 bytes, while it is 2 bytes
|
||
for <code class="docutils literal notranslate"><span class="pre">float16</span></code>. Thus, each model is about half its original size after conversion.</p>
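<p>The halving can be checked directly against the sizes listed in the table. This is only an illustrative calculation; the sizes are the rounded values printed by <code class="docutils literal notranslate"><span class="pre">ls</span> <span class="pre">-lh</span></code>, so the ratios are approximate.</p>

```python
# Sizes copied from the table above as (before, after); each pair uses the same unit.
sizes = {
    "encoder": (266.0, 133.0),   # MB
    "decoder": (1022.0, 509.0),  # KB
    "joiner": (2.8, 1.4),        # MB
}

# A ratio close to 0.5 is what we expect when 4-byte values become 2-byte values.
ratios = {name: after / before for name, (before, after) in sizes.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.3f}")
```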
|
||
<div class="admonition hint">
|
||
<p class="admonition-title">Hint</p>
|
||
<p>If you use <code class="docutils literal notranslate"><span class="pre">pnnx</span> <span class="pre">./encoder_jit_trace-pnnx.pt</span> <span class="pre">fp16=0</span></code>, then <code class="docutils literal notranslate"><span class="pre">pnnx</span></code>
|
||
won’t convert <code class="docutils literal notranslate"><span class="pre">float32</span></code> to <code class="docutils literal notranslate"><span class="pre">float16</span></code>.</p>
|
||
</div>
|
||
</section>
|
||
<section id="test-the-exported-models-in-icefall">
|
||
<h2>5. Test the exported models in icefall<a class="headerlink" href="#test-the-exported-models-in-icefall" title="Permalink to this heading"></a></h2>
|
||
<div class="admonition note">
|
||
<p class="admonition-title">Note</p>
|
||
<p>We assume you have set up the environment variable <code class="docutils literal notranslate"><span class="pre">PYTHONPATH</span></code> when
|
||
building <a class="reference external" href="https://github.com/tencent/ncnn">ncnn</a>.</p>
|
||
</div>
|
||
<p>Now we have successfully converted our pre-trained model to <a class="reference external" href="https://github.com/tencent/ncnn">ncnn</a> format.
|
||
The six generated files are what we need. You can use the following code to
|
||
test the converted models:</p>
|
||
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>python3<span class="w"> </span>./pruned_transducer_stateless7_streaming/streaming-ncnn-decode.py<span class="w"> </span><span class="se">\</span>
|
||
<span class="w"> </span>--tokens<span class="w"> </span>./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/data/lang_bpe_500/tokens.txt<span class="w"> </span><span class="se">\</span>
|
||
<span class="w"> </span>--encoder-param-filename<span class="w"> </span>./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/encoder_jit_trace-pnnx.ncnn.param<span class="w"> </span><span class="se">\</span>
|
||
<span class="w"> </span>--encoder-bin-filename<span class="w"> </span>./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/encoder_jit_trace-pnnx.ncnn.bin<span class="w"> </span><span class="se">\</span>
|
||
<span class="w"> </span>--decoder-param-filename<span class="w"> </span>./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/decoder_jit_trace-pnnx.ncnn.param<span class="w"> </span><span class="se">\</span>
|
||
<span class="w"> </span>--decoder-bin-filename<span class="w"> </span>./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/decoder_jit_trace-pnnx.ncnn.bin<span class="w"> </span><span class="se">\</span>
|
||
<span class="w"> </span>--joiner-param-filename<span class="w"> </span>./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/joiner_jit_trace-pnnx.ncnn.param<span class="w"> </span><span class="se">\</span>
|
||
<span class="w"> </span>--joiner-bin-filename<span class="w"> </span>./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/joiner_jit_trace-pnnx.ncnn.bin<span class="w"> </span><span class="se">\</span>
|
||
<span class="w"> </span>./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/test_wavs/1089-134686-0001.wav
|
||
</pre></div>
|
||
</div>
|
||
<div class="admonition hint">
|
||
<p class="admonition-title">Hint</p>
|
||
<p><a class="reference external" href="https://github.com/tencent/ncnn">ncnn</a> supports only <code class="docutils literal notranslate"><span class="pre">batch</span> <span class="pre">size</span> <span class="pre">==</span> <span class="pre">1</span></code>, so <code class="docutils literal notranslate"><span class="pre">streaming-ncnn-decode.py</span></code> accepts
|
||
only 1 wave file as input.</p>
|
||
</div>
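<p>Because only one wave file is accepted per run, decoding several files means invoking the script once per file. The sketch below assembles the same command line as above for an arbitrary wave file. The helper names <code class="docutils literal notranslate"><span class="pre">build_decode_command</span></code> and <code class="docutils literal notranslate"><span class="pre">decode_all</span></code> are hypothetical, and the sketch assumes it is run from the same directory as the command above.</p>

```python
import subprocess
from pathlib import Path
from typing import List


def build_decode_command(model_dir: str, wav: str) -> List[str]:
    """Build the argument list for decoding a single wave file."""
    exp = Path(model_dir) / "exp"
    cmd = [
        "python3",
        "./pruned_transducer_stateless7_streaming/streaming-ncnn-decode.py",
        "--tokens",
        str(Path(model_dir) / "data" / "lang_bpe_500" / "tokens.txt"),
    ]
    # Each of the three models has one .param file and one .bin file.
    for part in ("encoder", "decoder", "joiner"):
        prefix = exp / f"{part}_jit_trace-pnnx.ncnn"
        cmd += [f"--{part}-param-filename", f"{prefix}.param"]
        cmd += [f"--{part}-bin-filename", f"{prefix}.bin"]
    cmd.append(wav)
    return cmd


def decode_all(model_dir: str, wavs: List[str]) -> None:
    """Decode several wave files by invoking the script once per file."""
    for wav in wavs:
        subprocess.run(build_decode_command(model_dir, wav), check=True)
```

<p>Calling <code class="docutils literal notranslate"><span class="pre">decode_all</span></code> with the model directory and a list of wave files runs the decoder sequentially and stops at the first failure.</p>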
|
||
<p>The output is given below:</p>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">43</span><span class="p">:</span><span class="mi">40</span><span class="p">,</span><span class="mi">283</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">streaming</span><span class="o">-</span><span class="n">ncnn</span><span class="o">-</span><span class="n">decode</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">349</span><span class="p">]</span> <span class="p">{</span><span class="s1">'tokens'</span><span class="p">:</span> <span class="s1">'./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/data/lang_bpe_500/tokens.txt'</span><span class="p">,</span> <span class="s1">'encoder_param_filename'</span><span class="p">:</span> <span class="s1">'./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/encoder_jit_trace-pnnx.ncnn.param'</span><span class="p">,</span> <span class="s1">'encoder_bin_filename'</span><span class="p">:</span> <span class="s1">'./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/encoder_jit_trace-pnnx.ncnn.bin'</span><span class="p">,</span> <span class="s1">'decoder_param_filename'</span><span class="p">:</span> <span class="s1">'./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/decoder_jit_trace-pnnx.ncnn.param'</span><span class="p">,</span> <span class="s1">'decoder_bin_filename'</span><span class="p">:</span> <span class="s1">'./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/decoder_jit_trace-pnnx.ncnn.bin'</span><span class="p">,</span> <span class="s1">'joiner_param_filename'</span><span class="p">:</span> <span 
class="s1">'./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/joiner_jit_trace-pnnx.ncnn.param'</span><span class="p">,</span> <span class="s1">'joiner_bin_filename'</span><span class="p">:</span> <span class="s1">'./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/joiner_jit_trace-pnnx.ncnn.bin'</span><span class="p">,</span> <span class="s1">'sound_filename'</span><span class="p">:</span> <span class="s1">'./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/test_wavs/1089-134686-0001.wav'</span><span class="p">}</span>
|
||
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">43</span><span class="p">:</span><span class="mi">41</span><span class="p">,</span><span class="mi">260</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">streaming</span><span class="o">-</span><span class="n">ncnn</span><span class="o">-</span><span class="n">decode</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">357</span><span class="p">]</span> <span class="n">Constructing</span> <span class="n">Fbank</span> <span class="n">computer</span>
|
||
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">43</span><span class="p">:</span><span class="mi">41</span><span class="p">,</span><span class="mi">264</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">streaming</span><span class="o">-</span><span class="n">ncnn</span><span class="o">-</span><span class="n">decode</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">360</span><span class="p">]</span> <span class="n">Reading</span> <span class="n">sound</span> <span class="n">files</span><span class="p">:</span> <span class="o">./</span><span class="n">icefall</span><span class="o">-</span><span class="n">asr</span><span class="o">-</span><span class="n">librispeech</span><span class="o">-</span><span class="n">pruned</span><span class="o">-</span><span class="n">transducer</span><span class="o">-</span><span class="n">stateless7</span><span class="o">-</span><span class="n">streaming</span><span class="o">-</span><span class="mi">2022</span><span class="o">-</span><span class="mi">12</span><span class="o">-</span><span class="mi">29</span><span class="o">/</span><span class="n">test_wavs</span><span class="o">/</span><span class="mi">1089</span><span class="o">-</span><span class="mi">134686</span><span class="o">-</span><span class="mf">0001.</span><span class="n">wav</span>
|
||
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">43</span><span class="p">:</span><span class="mi">41</span><span class="p">,</span><span class="mi">269</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">streaming</span><span class="o">-</span><span class="n">ncnn</span><span class="o">-</span><span class="n">decode</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">365</span><span class="p">]</span> <span class="n">torch</span><span class="o">.</span><span class="n">Size</span><span class="p">([</span><span class="mi">106000</span><span class="p">])</span>
|
||
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">43</span><span class="p">:</span><span class="mi">41</span><span class="p">,</span><span class="mi">280</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">streaming</span><span class="o">-</span><span class="n">ncnn</span><span class="o">-</span><span class="n">decode</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">372</span><span class="p">]</span> <span class="n">number</span> <span class="n">of</span> <span class="n">states</span><span class="p">:</span> <span class="mi">35</span>
|
||
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">43</span><span class="p">:</span><span class="mi">45</span><span class="p">,</span><span class="mi">026</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">streaming</span><span class="o">-</span><span class="n">ncnn</span><span class="o">-</span><span class="n">decode</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">410</span><span class="p">]</span> <span class="o">./</span><span class="n">icefall</span><span class="o">-</span><span class="n">asr</span><span class="o">-</span><span class="n">librispeech</span><span class="o">-</span><span class="n">pruned</span><span class="o">-</span><span class="n">transducer</span><span class="o">-</span><span class="n">stateless7</span><span class="o">-</span><span class="n">streaming</span><span class="o">-</span><span class="mi">2022</span><span class="o">-</span><span class="mi">12</span><span class="o">-</span><span class="mi">29</span><span class="o">/</span><span class="n">test_wavs</span><span class="o">/</span><span class="mi">1089</span><span class="o">-</span><span class="mi">134686</span><span class="o">-</span><span class="mf">0001.</span><span class="n">wav</span>
|
||
<span class="mi">2023</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">27</span> <span class="mi">20</span><span class="p">:</span><span class="mi">43</span><span class="p">:</span><span class="mi">45</span><span class="p">,</span><span class="mi">026</span> <span class="n">INFO</span> <span class="p">[</span><span class="n">streaming</span><span class="o">-</span><span class="n">ncnn</span><span class="o">-</span><span class="n">decode</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">411</span><span class="p">]</span> <span class="n">AFTER</span> <span class="n">EARLY</span> <span class="n">NIGHTFALL</span> <span class="n">THE</span> <span class="n">YELLOW</span> <span class="n">LAMPS</span> <span class="n">WOULD</span> <span class="n">LIGHT</span> <span class="n">UP</span> <span class="n">HERE</span> <span class="n">AND</span> <span class="n">THERE</span> <span class="n">THE</span> <span class="n">SQUALID</span> <span class="n">QUARTER</span> <span class="n">OF</span> <span class="n">THE</span> <span class="n">BROTHELS</span>
|
||
</pre></div>
|
||
</div>
|
||
<p>Congratulations! You have successfully exported a model from PyTorch to <a class="reference external" href="https://github.com/tencent/ncnn">ncnn</a>!</p>
|
||
</section>
|
||
<section id="modify-the-exported-encoder-for-sherpa-ncnn">
|
||
<span id="zipformer-modify-the-exported-encoder-for-sherpa-ncnn"></span><h2>6. Modify the exported encoder for sherpa-ncnn<a class="headerlink" href="#modify-the-exported-encoder-for-sherpa-ncnn" title="Permalink to this heading"></a></h2>
|
||
<p>In order to use the exported models in <a class="reference external" href="https://github.com/k2-fsa/sherpa-ncnn">sherpa-ncnn</a>, we have to modify
|
||
<code class="docutils literal notranslate"><span class="pre">encoder_jit_trace-pnnx.ncnn.param</span></code>.</p>
|
||
<p>Let us have a look at the first few lines of <code class="docutils literal notranslate"><span class="pre">encoder_jit_trace-pnnx.ncnn.param</span></code>:</p>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="mi">7767517</span>
|
||
<span class="mi">2028</span> <span class="mi">2547</span>
|
||
<span class="n">Input</span> <span class="n">in0</span> <span class="mi">0</span> <span class="mi">1</span> <span class="n">in0</span>
|
||
</pre></div>
|
||
</div>
|
||
<p><strong>Explanation</strong> of the above three lines:</p>
|
||
<blockquote>
|
||
<div><ol class="arabic simple">
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">7767517</span></code>, it is a magic number and should not be changed.</p></li>
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">2028</span> <span class="pre">2547</span></code>, the first number <code class="docutils literal notranslate"><span class="pre">2028</span></code> specifies the number of layers
|
||
in this file, while <code class="docutils literal notranslate"><span class="pre">2547</span></code> specifies the number of intermediate outputs
|
||
of this file.</p></li>
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">Input</span> <span class="pre">in0</span> <span class="pre">0</span> <span class="pre">1</span> <span class="pre">in0</span></code>, <code class="docutils literal notranslate"><span class="pre">Input</span></code> is the layer type of this layer; <code class="docutils literal notranslate"><span class="pre">in0</span></code>
|
||
is the layer name of this layer; <code class="docutils literal notranslate"><span class="pre">0</span></code> means this layer has no input;
|
||
<code class="docutils literal notranslate"><span class="pre">1</span></code> means this layer has one output; <code class="docutils literal notranslate"><span class="pre">in0</span></code> is the output name of
|
||
this layer.</p></li>
|
||
</ol>
|
||
</div></blockquote>
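<p>The structure of these three lines can be made concrete with a small parser. This is a minimal sketch that only handles the header and the first layer line as described above; the function name and the dictionary keys are illustrative and not part of ncnn.</p>

```python
def parse_param_header(text: str) -> dict:
    """Parse the first three lines of an ncnn param file given as a string."""
    lines = text.splitlines()
    magic = int(lines[0])
    num_layers, num_intermediate_outputs = (int(x) for x in lines[1].split())

    # Layer line: type, name, number of inputs, number of outputs, then names.
    fields = lines[2].split()
    num_inputs, num_outputs = int(fields[2]), int(fields[3])
    # Input names come first, followed by output names.
    output_names = fields[4 + num_inputs : 4 + num_inputs + num_outputs]

    return {
        "magic": magic,
        "num_layers": num_layers,
        "num_intermediate_outputs": num_intermediate_outputs,
        "first_layer_type": fields[0],
        "first_layer_name": fields[1],
        "first_layer_outputs": output_names,
    }


header = parse_param_header("7767517\n2028 2547\nInput in0 0 1 in0\n")
print(header)
```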
|
||
<p>We need to add 1 extra line and also increment the number of layers.
|
||
The result looks like below:</p>
|
||
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span><span class="m">7767517</span>
|
||
<span class="m">2029</span><span class="w"> </span><span class="m">2547</span>
|
||
SherpaMetaData<span class="w"> </span>sherpa_meta_data1<span class="w"> </span><span class="m">0</span><span class="w"> </span><span class="m">0</span><span class="w"> </span><span class="nv">0</span><span class="o">=</span><span class="m">2</span><span class="w"> </span><span class="nv">1</span><span class="o">=</span><span class="m">32</span><span class="w"> </span><span class="nv">2</span><span class="o">=</span><span class="m">4</span><span class="w"> </span><span class="nv">3</span><span class="o">=</span><span class="m">7</span><span class="w"> </span><span class="nv">15</span><span class="o">=</span><span class="m">1</span><span class="w"> </span>-23316<span class="o">=</span><span class="m">5</span>,2,4,3,2,4<span class="w"> </span>-23317<span class="o">=</span><span class="m">5</span>,384,384,384,384,384<span class="w"> </span>-23318<span class="o">=</span><span class="m">5</span>,192,192,192,192,192<span class="w"> </span>-23319<span class="o">=</span><span class="m">5</span>,1,2,4,8,2<span class="w"> </span>-23320<span class="o">=</span><span class="m">5</span>,31,31,31,31,31
|
||
Input<span class="w"> </span>in0<span class="w"> </span><span class="m">0</span><span class="w"> </span><span class="m">1</span><span class="w"> </span>in0
|
||
</pre></div>
|
||
</div>
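<p>The edit can also be applied programmatically instead of by hand. The following is a minimal sketch that operates on the text of the param file; the function name <code class="docutils literal notranslate"><span class="pre">add_sherpa_meta_data</span></code> is hypothetical, and the caller is assumed to supply the complete <code class="docutils literal notranslate"><span class="pre">SherpaMetaData</span></code> line for their own model.</p>

```python
def add_sherpa_meta_data(param_text: str, meta_line: str) -> str:
    """Insert meta_line as the first layer and increment the layer count."""
    lines = param_text.splitlines()

    # Do nothing if the file has already been modified.
    if len(lines) > 2 and lines[2].startswith("SherpaMetaData"):
        return param_text

    num_layers, num_intermediate_outputs = (int(x) for x in lines[1].split())
    # The new layer has no inputs or outputs, so only the layer count changes.
    lines[1] = f"{num_layers + 1} {num_intermediate_outputs}"
    lines.insert(2, meta_line)
    return "\n".join(lines) + "\n"


original = "7767517\n2028 2547\nInput in0 0 1 in0\n"
patched = add_sherpa_meta_data(original, "SherpaMetaData sherpa_meta_data1 0 0 0=2")
print(patched)
```

<p>In practice you would read <code class="docutils literal notranslate"><span class="pre">encoder_jit_trace-pnnx.ncnn.param</span></code>, pass its content to the function, and write the result back. Keeping a backup of the original file is advisable.</p>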
|
||
<p><strong>Explanation</strong></p>
|
||
<blockquote>
|
||
<div><ol class="arabic">
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">7767517</span></code>, it is still the same</p></li>
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">2029</span> <span class="pre">2547</span></code>, we have added an extra layer, so we need to update <code class="docutils literal notranslate"><span class="pre">2028</span></code> to <code class="docutils literal notranslate"><span class="pre">2029</span></code>.
|
||
We don’t need to change <code class="docutils literal notranslate"><span class="pre">2547</span></code> since the newly added layer has no inputs or outputs.</p></li>
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">SherpaMetaData</span>  <span class="pre">sherpa_meta_data1</span>  <span class="pre">0</span> <span class="pre">0</span> <span class="pre">0=2</span> <span class="pre">1=32</span> <span class="pre">2=4</span> <span class="pre">3=7</span> <span class="pre">15=1</span> <span class="pre">-23316=5,2,4,3,2,4</span> <span class="pre">-23317=5,384,384,384,384,384</span> <span class="pre">-23318=5,192,192,192,192,192</span> <span class="pre">-23319=5,1,2,4,8,2</span> <span class="pre">-23320=5,31,31,31,31,31</span></code>
|
||
This line is newly added. Its explanation is given below:</p>
|
||
<blockquote>
|
||
<div><ul class="simple">
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">SherpaMetaData</span></code> is the type of this layer. Must be <code class="docutils literal notranslate"><span class="pre">SherpaMetaData</span></code>.</p></li>
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">sherpa_meta_data1</span></code> is the name of this layer. Must be <code class="docutils literal notranslate"><span class="pre">sherpa_meta_data1</span></code>.</p></li>
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">0</span> <span class="pre">0</span></code> means this layer has no inputs or outputs. Must be <code class="docutils literal notranslate"><span class="pre">0</span> <span class="pre">0</span></code>.</p></li>
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">0=2</span></code>, 0 is the key and 2 is the value. Must be <code class="docutils literal notranslate"><span class="pre">0=2</span></code>.</p></li>
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">1=32</span></code>, 1 is the key and 32 is the value of the
|
||
parameter <code class="docutils literal notranslate"><span class="pre">--decode-chunk-len</span></code> that you provided when running
|
||
<code class="docutils literal notranslate"><span class="pre">./pruned_transducer_stateless7_streaming/export-for-ncnn.py</span></code>.</p></li>
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">2=4</span></code>, 2 is the key and 4 is the value of the
|
||
parameter <code class="docutils literal notranslate"><span class="pre">--num-left-chunks</span></code> that you provided when running
|
||
<code class="docutils literal notranslate"><span class="pre">./pruned_transducer_stateless7_streaming/export-for-ncnn.py</span></code>.</p></li>
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">3=7</span></code>, 3 is the key and 7 is the amount of padding
|
||
used in the Conv2DSubsampling layer. It should be 7 for zipformer
|
||
if you don’t change zipformer.py.</p></li>
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">15=1</span></code>, attribute 15, this is the model version. Starting from
|
||
<a class="reference external" href="https://github.com/k2-fsa/sherpa-ncnn">sherpa-ncnn</a> v2.0, we require that the model version has to
|
||
be >= 1.</p></li>
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">-23316=5,2,4,3,2,4</span></code>, attribute 16, this is an array attribute.
|
||
It is attribute 16 since -23300 - (-23316) = 16.
|
||
The first element of the array is the length of the array, which is 5 in our case.
|
||
<code class="docutils literal notranslate"><span class="pre">2,4,3,2,4</span></code> is the value of <code class="docutils literal notranslate"><span class="pre">--num-encoder-layers</span></code> that you provided
when running <code class="docutils literal notranslate"><span class="pre">./pruned_transducer_stateless7_streaming/export-for-ncnn.py</span></code>.</p></li>
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">-23317=5,384,384,384,384,384</span></code>, attribute 17.
|
||
The first element of the array is the length of the array, which is 5 in our case.
|
||
<code class="docutils literal notranslate"><span class="pre">384,384,384,384,384</span></code> is the value of <code class="docutils literal notranslate"><span class="pre">--encoder-dims</span></code> that you provided
when running <code class="docutils literal notranslate"><span class="pre">./pruned_transducer_stateless7_streaming/export-for-ncnn.py</span></code>.</p></li>
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">-23318=5,192,192,192,192,192</span></code>, attribute 18.
|
||
The first element of the array is the length of the array, which is 5 in our case.
|
||
<code class="docutils literal notranslate"><span class="pre">192,192,192,192,192</span></code> is the value of <code class="docutils literal notranslate"><span class="pre">--attention-dims</span></code> that you provided
|
||
when running <code class="docutils literal notranslate"><span class="pre">./pruned_transducer_stateless7_streaming/export-for-ncnn.py</span></code>.</p></li>
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">-23319=5,1,2,4,8,2</span></code>, attribute 19.
|
||
The first element of the array is the length of the array, which is 5 in our case.
|
||
<code class="docutils literal notranslate"><span class="pre">1,2,4,8,2</span></code> is the value of <code class="docutils literal notranslate"><span class="pre">--zipformer-downsampling-factors</span></code> that you provided
|
||
when running <code class="docutils literal notranslate"><span class="pre">./pruned_transducer_stateless7_streaming/export-for-ncnn.py</span></code>.</p></li>
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">-23320=5,31,31,31,31,31</span></code>, attribute 20.
|
||
The first element of the array is the length of the array, which is 5 in our case.
|
||
<code class="docutils literal notranslate"><span class="pre">31,31,31,31,31</span></code> is the value of <code class="docutils literal notranslate"><span class="pre">--cnn-module-kernels</span></code> that you provided
|
||
when running <code class="docutils literal notranslate"><span class="pre">./pruned_transducer_stateless7_streaming/export-for-ncnn.py</span></code>.</p></li>
|
||
</ul>
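<p>The key arithmetic described above can be checked with a few lines of Python. This is only a sketch: the helper name <code class="docutils literal notranslate"><span class="pre">decode_param</span></code> is hypothetical, and it assumes every value in the layer is an integer, which holds for the attributes listed here.</p>

```python
ARRAY_KEY_BASE = -23300  # keys at or below this value denote array attributes


def decode_param(entry: str):
    """Decode one `key=value` entry into (attribute_id, value)."""
    key_text, value_text = entry.split("=", 1)
    key = int(key_text)
    if key > ARRAY_KEY_BASE:
        # Scalar attribute: the key is the attribute id itself.
        return key, int(value_text)
    # Array attribute: recover the id, then split off the length prefix.
    attribute_id = ARRAY_KEY_BASE - key
    length, *values = (int(v) for v in value_text.split(","))
    if length != len(values):
        raise ValueError(f"{entry!r}: length prefix {length}, got {len(values)} values")
    return attribute_id, values


print(decode_param("15=1"))                # (15, 1)
print(decode_param("-23316=5,2,4,3,2,4"))  # (16, [2, 4, 3, 2, 4])
```

<p>The length check is there because a wrong first element is an easy mistake to make when editing the array by hand.</p>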
|
||
<p>For ease of reference, we list the key-value pairs that you need to add
|
||
in the following table. If your model has a different setting, please
|
||
change the values for <code class="docutils literal notranslate"><span class="pre">SherpaMetaData</span></code> accordingly. Otherwise, you
|
||
will be <code class="docutils literal notranslate"><span class="pre">SAD</span></code>.</p>
|
||
<blockquote>
<div><table class="docutils align-default">
<thead>
<tr class="row-odd"><th class="head"><p>key</p></th>
<th class="head"><p>value</p></th>
</tr>
</thead>
<tbody>
<tr class="row-even"><td><p>0</p></td>
<td><p>2 (fixed)</p></td>
</tr>
<tr class="row-odd"><td><p>1</p></td>
<td><p><code class="docutils literal notranslate"><span class="pre">--decode-chunk-len</span></code></p></td>
</tr>
<tr class="row-even"><td><p>2</p></td>
<td><p><code class="docutils literal notranslate"><span class="pre">--num-left-chunks</span></code></p></td>
</tr>
<tr class="row-odd"><td><p>3</p></td>
<td><p>7 (if you don’t change the code)</p></td>
</tr>
<tr class="row-even"><td><p>15</p></td>
<td><p>1 (The model version)</p></td>
</tr>
<tr class="row-odd"><td><p>-23316</p></td>
<td><p><code class="docutils literal notranslate"><span class="pre">--num-encoder-layers</span></code></p></td>
</tr>
<tr class="row-even"><td><p>-23317</p></td>
<td><p><code class="docutils literal notranslate"><span class="pre">--encoder-dims</span></code></p></td>
</tr>
<tr class="row-odd"><td><p>-23318</p></td>
<td><p><code class="docutils literal notranslate"><span class="pre">--attention-dims</span></code></p></td>
</tr>
<tr class="row-even"><td><p>-23319</p></td>
<td><p><code class="docutils literal notranslate"><span class="pre">--zipformer-downsampling-factors</span></code></p></td>
</tr>
<tr class="row-odd"><td><p>-23320</p></td>
<td><p><code class="docutils literal notranslate"><span class="pre">--cnn-module-kernels</span></code></p></td>
</tr>
</tbody>
</table>
</div></blockquote>
|
||
</div></blockquote>
|
||
</li>
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">Input</span> <span class="pre">in0</span> <span class="pre">0</span> <span class="pre">1</span> <span class="pre">in0</span></code>. No need to change it.</p></li>
|
||
</ol>
|
||
</div></blockquote>
|
||
<div class="admonition caution">
|
||
<p class="admonition-title">Caution</p>
|
||
<p>When you add a new layer <code class="docutils literal notranslate"><span class="pre">SherpaMetaData</span></code>, please remember to update the
|
||
number of layers. In our case, update <code class="docutils literal notranslate"><span class="pre">2028</span></code> to <code class="docutils literal notranslate"><span class="pre">2029</span></code>. Otherwise,
|
||
you will be SAD later.</p>
|
||
</div>
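<p>Inserting the line and bumping the layer count can be done in one step so that neither is forgotten. This sketch assumes the usual ncnn text layout (a magic number on the first line, the layer and blob counts on the second) and that the new line goes directly before the <code class="docutils literal notranslate"><span class="pre">Input</span></code> line. The function name <code class="docutils literal notranslate"><span class="pre">add_layer</span></code>, the blob count 2547, and the shortened layer line are illustrative only.</p>

```python
def add_layer(param_text: str, layer_line: str) -> str:
    """Insert layer_line as the first layer and increment the layer count."""
    magic, counts, *layers = param_text.splitlines()
    layer_count, blob_count = (int(n) for n in counts.split())
    # SherpaMetaData produces no blobs, so only the layer count changes.
    counts = f"{layer_count + 1} {blob_count}"
    return "\n".join([magic, counts, layer_line, *layers]) + "\n"


# A tiny stand-in for a real param file; a real one has thousands of lines.
original = "7767517\n2028 2547\nInput in0 0 1 in0\n"
layer = "SherpaMetaData sherpa_meta_data1 0 0 0=2 15=1"  # shortened for brevity
print(add_layer(original, layer))
```

<p>Write the result to a new file rather than overwriting the exported one, so that the unmodified model stays available.</p>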
|
||
<div class="admonition hint">
|
||
<p class="admonition-title">Hint</p>
|
||
<p>After adding the new layer <code class="docutils literal notranslate"><span class="pre">SherpaMetaData</span></code>, you cannot use this model
|
||
with <code class="docutils literal notranslate"><span class="pre">streaming-ncnn-decode.py</span></code> anymore since <code class="docutils literal notranslate"><span class="pre">SherpaMetaData</span></code> is
|
||
supported only in <a class="reference external" href="https://github.com/k2-fsa/sherpa-ncnn">sherpa-ncnn</a>.</p>
|
||
</div>
|
||
<div class="admonition hint">
|
||
<p class="admonition-title">Hint</p>
|
||
<p><a class="reference external" href="https://github.com/tencent/ncnn">ncnn</a> is very flexible. You can add new layers to it just by text-editing
|
||
the <code class="docutils literal notranslate"><span class="pre">param</span></code> file! You don’t need to change the <code class="docutils literal notranslate"><span class="pre">bin</span></code> file.</p>
|
||
</div>
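<p>Because the <code class="docutils literal notranslate"><span class="pre">param</span></code> file is plain text, a consistency check after editing is easy to script. The sketch below assumes one layer per line after the two header lines; the function name <code class="docutils literal notranslate"><span class="pre">check_param</span></code> and the sample text are hypothetical.</p>

```python
def check_param(param_text: str):
    """Return (count_matches, has_meta_data) for the text of a param file."""
    lines = [ln for ln in param_text.splitlines() if ln.strip()]
    declared = int(lines[1].split()[0])  # first number on the second line
    layer_types = [ln.split()[0] for ln in lines[2:]]
    return declared == len(layer_types), "SherpaMetaData" in layer_types


sample = (
    "7767517\n"
    "2 1\n"
    "SherpaMetaData sherpa_meta_data1 0 0 0=2 15=1\n"
    "Input in0 0 1 in0\n"
)
print(check_param(sample))  # (True, True)
```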
|
||
<p>Now you can use this model in <a class="reference external" href="https://github.com/k2-fsa/sherpa-ncnn">sherpa-ncnn</a>.
|
||
Please refer to the following documentation:</p>
|
||
<blockquote>
|
||
<div><ul class="simple">
|
||
<li><p>Linux/macOS/Windows/arm/aarch64: <a class="reference external" href="https://k2-fsa.github.io/sherpa/ncnn/install/index.html">https://k2-fsa.github.io/sherpa/ncnn/install/index.html</a></p></li>
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">Android</span></code>: <a class="reference external" href="https://k2-fsa.github.io/sherpa/ncnn/android/index.html">https://k2-fsa.github.io/sherpa/ncnn/android/index.html</a></p></li>
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">iOS</span></code>: <a class="reference external" href="https://k2-fsa.github.io/sherpa/ncnn/ios/index.html">https://k2-fsa.github.io/sherpa/ncnn/ios/index.html</a></p></li>
|
||
<li><p>Python: <a class="reference external" href="https://k2-fsa.github.io/sherpa/ncnn/python/index.html">https://k2-fsa.github.io/sherpa/ncnn/python/index.html</a></p></li>
|
||
</ul>
|
||
</div></blockquote>
|
||
<p>We have a list of pre-trained models that have been exported for <a class="reference external" href="https://github.com/k2-fsa/sherpa-ncnn">sherpa-ncnn</a>:</p>
|
||
<blockquote>
|
||
<div><ul>
|
||
<li><p><a class="reference external" href="https://k2-fsa.github.io/sherpa/ncnn/pretrained_models/index.html">https://k2-fsa.github.io/sherpa/ncnn/pretrained_models/index.html</a></p>
|
||
<p>You can find more usage examples there.</p>
|
||
</li>
|
||
</ul>
|
||
</div></blockquote>
|
||
</section>
|
||
</section>
|
||
|
||
|
||
</div>
|
||
</div>
|
||
<footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
|
||
<a href="export-ncnn.html" class="btn btn-neutral float-left" title="Export to ncnn" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
|
||
<a href="export-ncnn-conv-emformer.html" class="btn btn-neutral float-right" title="Export ConvEmformer transducer models to ncnn" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
|
||
</div>
|
||
|
||
<hr/>
|
||
|
||
<div role="contentinfo">
|
||
<p>© Copyright 2021, icefall development team.</p>
|
||
</div>
|
||
|
||
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
|
||
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
|
||
provided by <a href="https://readthedocs.org">Read the Docs</a>.
|
||
|
||
|
||
</footer>
|
||
</div>
|
||
</div>
|
||
</section>
|
||
</div>
|
||
<script>
|
||
jQuery(function () {
|
||
SphinxRtdTheme.Navigation.enable(true);
|
||
});
|
||
</script>
|
||
|
||
</body>
|
||
</html> |