mirror of
https://github.com/k2-fsa/icefall.git
synced 2025-09-06 23:54:17 +00:00
188 lines
5.1 KiB
ReStructuredText
.. _export_streaming_zipformer_transducer_models_to_mnn:

Export streaming Zipformer transducer models to MNN
----------------------------------------------------

We use the pre-trained model from the following repository as an example:

`<https://huggingface.co/pfluo/k2fsa-zipformer-bilingual-zh-en-t>`_

We will show you step by step how to export it to `MNN`_ and run it with `sherpa-mnn`_.

.. hint::

  We use ``Ubuntu 20.04``, ``torch 2.0.0``, and ``Python 3.8`` for testing.

.. caution::

  Please use a more recent version of PyTorch. For instance, ``torch 1.8``
  may **not** work.

1. Download the pre-trained model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. hint::

  You have to install `git-lfs`_ before you continue.

.. code-block:: bash

  cd egs/librispeech/ASR
  git clone https://huggingface.co/pfluo/k2fsa-zipformer-bilingual-zh-en-t

  # Return to the icefall root directory, where the later steps start
  cd ../../..

In the above code, we downloaded the pre-trained model into the directory
``egs/librispeech/ASR/k2fsa-zipformer-bilingual-zh-en-t``.

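If `git-lfs`_ was not active during the clone, large files such as the
checkpoint are left as small text pointer stubs instead of the real weights.
The following is a minimal sketch for detecting that case. The helper name
``is_lfs_pointer`` is our own, and the path assumes you run it from the icefall
root directory and that the checkpoint is ``exp/pretrained.pt``.

.. code-block:: bash

  # Hypothetical helper: succeeds if the file is still a Git LFS pointer stub.
  is_lfs_pointer() {
    # Pointer stubs are small text files whose first line names the LFS spec.
    head -c 64 "$1" | grep -q '^version https://git-lfs.github.com/spec/v1'
  }

  # Assumed location of the checkpoint, relative to the icefall root.
  ckpt=egs/librispeech/ASR/k2fsa-zipformer-bilingual-zh-en-t/exp/pretrained.pt

  if [ -f "$ckpt" ] && is_lfs_pointer "$ckpt"; then
    echo "$ckpt is an LFS pointer; run 'git lfs pull' inside the cloned repository"
  fi

If the message is printed, fetch the real files before continuing with the
export.
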
.. _export_for_mnn_install_mnn:

2. Install MNN
^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: bash

  # We put MNN into $HOME/open-source/MNN
  # You can change it to anywhere you like

  cd $HOME
  mkdir -p open-source
  cd open-source

  git clone https://github.com/alibaba/MNN
  cd MNN
  mkdir build && cd build

  cmake \
    -DMNN_BUILD_CONVERTER=ON \
    -DMNN_BUILD_TORCH=ON \
    -DMNN_BUILD_TOOLS=ON \
    -DMNN_BUILD_BENCHMARK=ON \
    -DMNN_EVALUATION=ON \
    -DMNN_BUILD_DEMO=ON \
    -DMNN_BUILD_TEST=ON \
    -DMNN_BUILD_QUANTOOLS=ON \
    ..

  make -j4

  cd ..

  # Note: $PWD here is $HOME/open-source/MNN

  export PATH=$PWD/build:$PATH

Congratulations! You have successfully installed the following component:

- ``MNNConvert``, which is an executable located in
  ``$HOME/open-source/MNN/build``. We will use
  it to convert models from ``ONNX`` to the MNN format.

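Before moving on, it can help to confirm that the shell actually finds
``MNNConvert`` through ``PATH``. The sketch below uses only the standard
``command -v`` builtin. The function name ``check_tool`` is our own and is not
part of MNN.

.. code-block:: bash

  # Hypothetical helper: report where a tool is found, or fail if it is not.
  check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
      echo "$1 found at: $(command -v "$1")"
    else
      echo "$1 not found in PATH" >&2
      return 1
    fi
  }

  check_tool MNNConvert || echo "Re-run the 'export PATH=...' line in this shell"

Note that ``export PATH=...`` only affects the current shell session. You need
to repeat it in every new terminal, or add it to your shell start-up file.
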
3. Export the model to ONNX
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

First, let us give our pre-trained model the file name that the export script
expects. We do this with a symbolic link:

.. code-block:: bash

  cd egs/librispeech/ASR

  cd k2fsa-zipformer-bilingual-zh-en-t/exp

  ln -s pretrained.pt epoch-99.pt

  cd ../..

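A dangling link only shows up later as a confusing checkpoint-loading error, so
a quick check can save time. This is a small sketch. The function name
``link_ok`` is our own, and the path assumes the current directory is
``egs/librispeech/ASR``.

.. code-block:: bash

  # Hypothetical helper: succeeds if the path is a symlink to an existing file.
  link_ok() {
    # -L tests for a symlink; -e follows it and tests that the target exists.
    [ -L "$1" ] && [ -e "$1" ]
  }

  ckpt_link=k2fsa-zipformer-bilingual-zh-en-t/exp/epoch-99.pt

  if link_ok "$ckpt_link"; then
    echo "epoch-99.pt -> $(readlink "$ckpt_link")"
  else
    echo "epoch-99.pt is missing or is a dangling link" >&2
  fi
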
Next, we use the following code to export our model:

.. code-block:: bash

  dir=./k2fsa-zipformer-bilingual-zh-en-t

  ./pruned_transducer_stateless7_streaming/export-onnx-zh.py \
    --tokens $dir/data/lang_char_bpe/tokens.txt \
    --exp-dir $dir/exp \
    --use-averaged-model 0 \
    --epoch 99 \
    --avg 1 \
    --decode-chunk-len 32 \
    --num-encoder-layers "2,2,2,2,2" \
    --feedforward-dims "768,768,768,768,768" \
    --nhead "4,4,4,4,4" \
    --encoder-dims "256,256,256,256,256" \
    --attention-dims "192,192,192,192,192" \
    --encoder-unmasked-dims "192,192,192,192,192" \
    --zipformer-downsampling-factors "1,2,4,8,2" \
    --cnn-module-kernels "31,31,31,31,31" \
    --decoder-dim 512 \
    --joiner-dim 512

.. caution::

  If your model has different configuration parameters, please change them accordingly.

.. hint::

  We have renamed our model to ``epoch-99.pt`` so that we can use ``--epoch 99``.
  There is only one pre-trained model, so we use ``--avg 1 --use-averaged-model 0``.

  If you have trained a model by yourself and if you have all checkpoints
  available, please first use ``decode.py`` to tune ``--epoch`` and ``--avg``,
  and select the best combination with ``--use-averaged-model 1``.

After the above step, we will get the following files:

.. code-block:: bash

  ls -lh k2fsa-zipformer-bilingual-zh-en-t/exp/*.onnx

  .rw-rw-r-- 88,435,414 meixu 2023-05-12 10:05 encoder-epoch-99-avg-1.onnx
  .rw-rw-r-- 13,876,389 meixu 2023-05-12 10:05 decoder-epoch-99-avg-1.onnx
  .rw-rw-r-- 12,833,674 meixu 2023-05-12 10:05 joiner-epoch-99-avg-1.onnx

.. _zipformer-transducer-step-4-export-torchscript-model-via-pnnx:

4. Convert the model from ONNX to MNN
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. hint::

  Make sure you have set up the ``PATH`` environment variable
  in :ref:`export_for_mnn_install_mnn`. Otherwise,
  you will get an error saying that ``MNNConvert`` could not be found.

Now, it's time to convert our models to `MNN`_.

.. code-block:: bash

  cd k2fsa-zipformer-bilingual-zh-en-t/exp/

  MNNConvert -f ONNX --modelFile encoder-epoch-99-avg-1.onnx --MNNModel encoder-epoch-99-avg-1.mnn --bizCode MNN
  MNNConvert -f ONNX --modelFile decoder-epoch-99-avg-1.onnx --MNNModel decoder-epoch-99-avg-1.mnn --bizCode MNN
  MNNConvert -f ONNX --modelFile joiner-epoch-99-avg-1.onnx --MNNModel joiner-epoch-99-avg-1.mnn --bizCode MNN

.. note::

  You will see the following log output:

  .. literalinclude:: ./code/export-zipformer-transducer-for-mnn-output.txt

It will generate the following files:

.. code-block:: bash

  ls -lh k2fsa-zipformer-bilingual-zh-en-t/exp/*.mnn

  .rw-rw-r-- 89,065,932 meixu 2023-05-09 15:13 encoder-epoch-99-avg-1.mnn
  .rw-rw-r-- 13,917,864 meixu 2023-05-09 15:12 decoder-epoch-99-avg-1.mnn
  .rw-rw-r-- 12,836,004 meixu 2023-05-09 15:12 joiner-epoch-99-avg-1.mnn

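The three ``MNNConvert`` invocations differ only in the model name, so they can
also be written as a loop. This is a sketch that reuses exactly the flags shown
in this step. It assumes the current directory is
``k2fsa-zipformer-bilingual-zh-en-t/exp`` and that the files follow the
``*-epoch-99-avg-1`` naming used here. The missing-file check is our addition.

.. code-block:: bash

  # Convert encoder, decoder, and joiner with the same MNNConvert flags.
  for name in encoder decoder joiner; do
    onnx_file=$name-epoch-99-avg-1.onnx
    mnn_file=$name-epoch-99-avg-1.mnn

    # Skip models that were not exported instead of failing midway.
    if [ ! -f "$onnx_file" ]; then
      echo "Skipping $name: $onnx_file not found" >&2
      continue
    fi

    MNNConvert -f ONNX --modelFile "$onnx_file" --MNNModel "$mnn_file" --bizCode MNN
  done

If you exported with a different ``--epoch`` or ``--avg``, adjust the file name
pattern accordingly.
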
Congratulations! You have successfully exported a model from PyTorch to `MNN`_!

Now you can use this model in `sherpa-mnn`_.
Please refer to the following documentation:

- Linux/aarch64: `<https://k2-fsa.github.io/sherpa/mnn/install/index.html>`_