.. _export_streaming_zipformer_transducer_models_to_mnn:

Export streaming Zipformer transducer models to MNN
----------------------------------------------------

We use the pre-trained model from the following repository as an example:

`<https://huggingface.co/pfluo/k2fsa-zipformer-bilingual-zh-en-t>`_

We will show you step by step how to export it to `MNN`_ and run it with `sherpa-MNN`_.

.. hint::

  We use ``Ubuntu 20.04``, ``torch 2.0.0``, and ``Python 3.8`` for testing.

.. caution::

  Please use a more recent version of PyTorch. For instance, ``torch 1.8``
  may ``not`` work.

1. Download the pre-trained model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. hint::

  You have to install `git-lfs`_ before you continue.

.. code-block:: bash

  cd egs/librispeech/ASR

  git clone https://huggingface.co/pfluo/k2fsa-zipformer-bilingual-zh-en-t

  cd ..

In the above code, we downloaded the pre-trained model into the directory
``egs/librispeech/ASR/k2fsa-zipformer-bilingual-zh-en-t``.
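
If you want to make sure that `git-lfs`_ actually fetched the checkpoint and not
just a small LFS pointer file, you can run the following optional check (it is
not part of the original recipe). A Git LFS pointer file is only a few hundred
bytes, while the real checkpoint is much larger:

.. code-block:: bash

  cd egs/librispeech/ASR

  # If this shows only a tiny file, run "git lfs pull" inside the cloned
  # repository to download the real checkpoint.
  ls -lh k2fsa-zipformer-bilingual-zh-en-t/exp/pretrained.pt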

.. _export_for_mnn_install_mnn:

2. Install MNN
^^^^^^^^^^^^^^

.. code-block:: bash

  # We put MNN into $HOME/open-source/MNN
  # You can change it to anywhere you like

  cd $HOME
  mkdir -p open-source

  cd open-source
  git clone https://github.com/alibaba/MNN
  cd MNN
  mkdir build && cd build

  cmake \
    -DMNN_BUILD_CONVERTER=ON \
    -DMNN_BUILD_TORCH=ON \
    -DMNN_BUILD_TOOLS=ON \
    -DMNN_BUILD_BENCHMARK=ON \
    -DMNN_EVALUATION=ON \
    -DMNN_BUILD_DEMO=ON \
    -DMNN_BUILD_TEST=ON \
    -DMNN_BUILD_QUANTOOLS=ON \
    ..

  make -j4

  cd ..

  # Note: $PWD here is $HOME/open-source/MNN

  export PATH=$PWD/build:$PATH

Congratulations! You have successfully installed the following components:

  - ``MNNConvert``, which is an executable located in
    ``$HOME/open-source/MNN/build``. We will use
    it to convert models from ``ONNX``.
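
Before you continue, you may want to confirm that ``MNNConvert`` can be found
through your ``PATH``. This quick check is optional and not part of the
original recipe:

.. code-block:: bash

  # Should print something like $HOME/open-source/MNN/build/MNNConvert.
  # If it prints nothing, re-run the "export PATH=..." command above.
  which MNNConvert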

3. Export the model to ONNX
^^^^^^^^^^^^^^^^^^^^^^^^^^^

First, let us rename our pre-trained model:

.. code-block:: bash

  cd egs/librispeech/ASR

  cd k2fsa-zipformer-bilingual-zh-en-t/exp

  ln -s pretrained.pt epoch-99.pt

  cd ../..
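
If you like, you can confirm that the symlink is in place before exporting
(an optional check, not part of the original recipe):

.. code-block:: bash

  # Run from egs/librispeech/ASR; the symlink should point to pretrained.pt.
  ls -lh k2fsa-zipformer-bilingual-zh-en-t/exp/epoch-99.pt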

Next, we use the following code to export our model:

.. code-block:: bash

  dir=./k2fsa-zipformer-bilingual-zh-en-t

  ./pruned_transducer_stateless7_streaming/export-onnx-zh.py \
    --tokens $dir/data/lang_char_bpe/tokens.txt \
    --exp-dir $dir/exp \
    --use-averaged-model 0 \
    --epoch 99 \
    --avg 1 \
    --decode-chunk-len 32 \
    --num-encoder-layers "2,2,2,2,2" \
    --feedforward-dims "768,768,768,768,768" \
    --nhead "4,4,4,4,4" \
    --encoder-dims "256,256,256,256,256" \
    --attention-dims "192,192,192,192,192" \
    --encoder-unmasked-dims "192,192,192,192,192" \
    --zipformer-downsampling-factors "1,2,4,8,2" \
    --cnn-module-kernels "31,31,31,31,31" \
    --decoder-dim 512 \
    --joiner-dim 512

.. caution::

  If your model has different configuration parameters, please change them accordingly.

.. hint::

  We have renamed our model to ``epoch-99.pt`` so that we can use ``--epoch 99``.
  There is only one pre-trained model, so we use ``--avg 1 --use-averaged-model 0``.

  If you have trained a model by yourself and if you have all checkpoints
  available, please first use ``decode.py`` to tune ``--epoch`` and ``--avg``
  and select the best combination with ``--use-averaged-model 1``.
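
  For example, a tuning run might look roughly as follows. Treat this only as a
  sketch: arguments other than ``--epoch``, ``--avg``, and
  ``--use-averaged-model`` (e.g., ``--exp-dir`` and the decoding method) depend
  on your own training setup and are shown here only for illustration.

  .. code-block:: bash

    # Try several --epoch/--avg combinations and keep the one with the best WER/CER.
    ./pruned_transducer_stateless7_streaming/decode.py \
      --epoch 30 \
      --avg 9 \
      --use-averaged-model 1 \
      --exp-dir ./pruned_transducer_stateless7_streaming/exp \
      --decoding-method greedy_search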

After the above step, we will get the following files:

.. code-block:: bash

  ls -lh k2fsa-zipformer-bilingual-zh-en-t/exp/*.onnx

  .rw-rw-r-- 88,435,414 meixu 2023-05-12 10:05 encoder-epoch-99-avg-1.onnx
  .rw-rw-r-- 13,876,389 meixu 2023-05-12 10:05 decoder-epoch-99-avg-1.onnx
  .rw-rw-r-- 12,833,674 meixu 2023-05-12 10:05 joiner-epoch-99-avg-1.onnx
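
If you want to verify that the exported models can be loaded before converting
them, the following optional check loads each file with ``onnxruntime``
(assuming ``onnxruntime`` is installed in your Python environment; this step is
not part of the original recipe):

.. code-block:: bash

  cd k2fsa-zipformer-bilingual-zh-en-t/exp

  # Each model should load without raising an exception.
  for m in encoder decoder joiner; do
    python3 -c "import onnxruntime; onnxruntime.InferenceSession('${m}-epoch-99-avg-1.onnx', providers=['CPUExecutionProvider']); print('${m}: ok')"
  done

  cd ../..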

.. _zipformer-transducer-step-4-export-torchscript-model-via-pnnx:

4. Convert the model from ONNX to MNN
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. hint::

  Make sure you have set up the ``PATH`` environment variable
  in :ref:`export_for_mnn_install_mnn`. Otherwise,
  it will throw an error saying that ``MNNConvert`` could not be found.

Now, it's time to export our models to `MNN`_.

.. code-block:: bash

  cd k2fsa-zipformer-bilingual-zh-en-t/exp/

  MNNConvert -f ONNX --modelFile encoder-epoch-99-avg-1.onnx --MNNModel encoder-epoch-99-avg-1.mnn --bizCode MNN
  MNNConvert -f ONNX --modelFile decoder-epoch-99-avg-1.onnx --MNNModel decoder-epoch-99-avg-1.mnn --bizCode MNN
  MNNConvert -f ONNX --modelFile joiner-epoch-99-avg-1.onnx --MNNModel joiner-epoch-99-avg-1.mnn --bizCode MNN
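
If you prefer, the three conversions can also be written as a single loop; this
is just an equivalent reformulation of the commands shown above:

.. code-block:: bash

  cd k2fsa-zipformer-bilingual-zh-en-t/exp/

  for m in encoder decoder joiner; do
    MNNConvert -f ONNX --modelFile ${m}-epoch-99-avg-1.onnx --MNNModel ${m}-epoch-99-avg-1.mnn --bizCode MNN
  done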

.. note::

  You will see the following log output:

  .. literalinclude:: ./code/export-zipformer-transducer-for-mnn-output.txt

It will generate the following files:

.. code-block:: bash

  ls -lh k2fsa-zipformer-bilingual-zh-en-t/exp/*.mnn

  .rw-rw-r-- 12,836,004 meixu 2023-05-09 15:12 joiner-epoch-99-avg-1.mnn
  .rw-rw-r-- 13,917,864 meixu 2023-05-09 15:12 decoder-epoch-99-avg-1.mnn
  .rw-rw-r-- 89,065,932 meixu 2023-05-09 15:13 encoder-epoch-99-avg-1.mnn

Congratulations! You have successfully exported a model from PyTorch to `MNN`_!
Now you can use this model in `sherpa-mnn`_.

Please refer to the following documentation:

  - Linux/aarch64: `<https://k2-fsa.github.io/sherpa/mnn/install/index.html>`_