add export-mnn docs

This commit is contained in:
  parent eeeeef390b
  commit a9c22e79b4
.gitignore (vendored): 1 addition

@@ -24,6 +24,7 @@ node_modules
 # Ignore all text files
 *.txt
+!docs/source/model-export/code/*.txt

 # Ignore files related to API keys
 .env
docs/source/model-export/code/export-zipformer-transducer-for-mnn-output.txt (new file, 38 lines)

@@ -0,0 +1,38 @@
# encoder
/audio/code/MNN/build/MNNConvert -f ONNX --modelFile encoder-epoch-99-avg-1.onnx --MNNModel encoder-epoch-99-avg-1.mnn --bizCode MNN
The device support i8sdot:0, support fp16:0, support i8mm: 0
Start to Convert Other Model Format To MNN Model...
[16:52:23] /audio/code/MNN/tools/converter/source/onnx/onnxConverter.cpp:98: ONNX Model ir version: 7
[16:52:23] /audio/code/MNN/tools/converter/source/onnx/onnxConverter.cpp:99: ONNX Model opset version: 13
Start to Optimize the MNN Net...
88 op name is empty or dup, set to Unsqueeze88
188 op name is empty or dup, set to Unsqueeze188
215 op name is empty or dup, set to Shape215
...
inputTensors : [ x, cached_avg_0, cached_len_0, cached_key_0, cached_val_0, cached_conv1_0, cached_val2_0, cached_conv2_0, cached_avg_1, cached_len_1, cached_key_1, cached_val_1, cached_conv1_1, cached_val2_1, cached_conv2_1, cached_avg_2, cached_len_2, cached_key_2, cached_val_2, cached_conv1_2, cached_val2_2, cached_conv2_2, cached_avg_3, cached_len_3, cached_key_3, cached_val_3, cached_conv1_3, cached_val2_3, cached_conv2_3, cached_avg_4, cached_len_4, cached_key_4, cached_val_4, cached_conv1_4, cached_val2_4, cached_conv2_4, ]
outputTensors: [ encoder_out, new_cached_avg_0, new_cached_avg_1, new_cached_avg_2, new_cached_avg_3, new_cached_avg_4, new_cached_conv1_0, new_cached_conv1_1, new_cached_conv1_2, new_cached_conv1_3, new_cached_conv1_4, new_cached_conv2_0, new_cached_conv2_1, new_cached_conv2_2, new_cached_conv2_3, new_cached_conv2_4, new_cached_key_0, new_cached_key_1, new_cached_key_2, new_cached_key_3, new_cached_key_4, new_cached_len_0, new_cached_len_1, new_cached_len_2, new_cached_len_3, new_cached_len_4, new_cached_val2_0, new_cached_val2_1, new_cached_val2_2, new_cached_val2_3, new_cached_val2_4, new_cached_val_0, new_cached_val_1, new_cached_val_2, new_cached_val_3, new_cached_val_4, ]
Converted Success!

# decoder
/audio/code/MNN/build/MNNConvert -f ONNX --modelFile decoder-epoch-99-avg-1.onnx --MNNModel decoder-epoch-99-avg-1.mnn --bizCode MNN
The device support i8sdot:0, support fp16:0, support i8mm: 0
Start to Convert Other Model Format To MNN Model...
[16:51:58] /audio/code/MNN/tools/converter/source/onnx/onnxConverter.cpp:98: ONNX Model ir version: 7
[16:51:58] /audio/code/MNN/tools/converter/source/onnx/onnxConverter.cpp:99: ONNX Model opset version: 13
Start to Optimize the MNN Net...
167 op name is empty or dup, set to Unsqueeze167
inputTensors : [ y, ]
outputTensors: [ decoder_out, ]
The model has subgraphs, please use MNN::Module to run it
Converted Success!

# joiner
/audio/code/MNN/build/MNNConvert -f ONNX --modelFile joiner-epoch-99-avg-1.onnx --MNNModel joiner-epoch-99-avg-1.mnn --bizCode MNN
The device support i8sdot:0, support fp16:0, support i8mm: 0
Start to Convert Other Model Format To MNN Model...
[16:51:01] /audio/code/MNN/tools/converter/source/onnx/onnxConverter.cpp:98: ONNX Model ir version: 7
[16:51:01] /audio/code/MNN/tools/converter/source/onnx/onnxConverter.cpp:99: ONNX Model opset version: 13
Start to Optimize the MNN Net...
inputTensors : [ encoder_out, decoder_out, ]
outputTensors: [ logit, ]
Converted Success!
docs/source/model-export/export-mnn-zipformer.rst (new file, 187 lines)

@@ -0,0 +1,187 @@
.. _export_streaming_zipformer_transducer_models_to_mnn:

Export streaming Zipformer transducer models to MNN
----------------------------------------------------

We use the pre-trained model from the following repository as an example:

`<https://huggingface.co/pfluo/k2fsa-zipformer-bilingual-zh-en-t>`_

We will show you step by step how to export it to `MNN`_ and run it with `sherpa-mnn`_.

.. hint::

   We use ``Ubuntu 20.04``, ``torch 2.0.0``, and ``Python 3.8`` for testing.

.. caution::

   Please use a more recent version of PyTorch. For instance, ``torch 1.8``
   may ``not`` work.
1. Download the pre-trained model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. hint::

   You have to install `git-lfs`_ before you continue.
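If `git-lfs`_ is not installed yet, one possible way to get it, assuming an
Ubuntu system as used above (other platforms are covered at
`<https://git-lfs.com>`_), is:

.. code-block:: bash

   # Ubuntu example; the package name and steps may differ on other distributions.
   sudo apt-get install git-lfs
   git lfs install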
.. code-block:: bash

   cd egs/librispeech/ASR

   git clone https://huggingface.co/pfluo/k2fsa-zipformer-bilingual-zh-en-t

   cd ..

In the above code, we downloaded the pre-trained model into the directory
``egs/librispeech/ASR/k2fsa-zipformer-bilingual-zh-en-t``.
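As an optional sanity check, you can run the following from
``egs/librispeech/ASR`` to make sure `git-lfs`_ really downloaded the model
weights instead of leaving small pointer files behind (the exact size shown
is only illustrative):

.. code-block:: bash

   # pretrained.pt should be tens of megabytes or more; a file of only a few
   # hundred bytes means git-lfs did not fetch the actual weights.
   ls -lh k2fsa-zipformer-bilingual-zh-en-t/exp/pretrained.pt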
.. _export_for_mnn_install_mnn:

2. Install MNN
^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: bash

   # We put MNN into $HOME/open-source/MNN
   # You can change it to anywhere you like

   cd $HOME
   mkdir -p open-source
   cd open-source

   git clone https://github.com/alibaba/MNN
   cd MNN
   mkdir build && cd build

   cmake \
     -DMNN_BUILD_CONVERTER=ON \
     -DMNN_BUILD_TORCH=ON \
     -DMNN_BUILD_TOOLS=ON \
     -DMNN_BUILD_BENCHMARK=ON \
     -DMNN_EVALUATION=ON \
     -DMNN_BUILD_DEMO=ON \
     -DMNN_BUILD_TEST=ON \
     -DMNN_BUILD_QUANTOOLS=ON \
     ..

   make -j4

   cd ..

   # Note: $PWD here is $HOME/open-source/MNN

   export PATH=$PWD/build:$PATH

Congratulations! You have successfully installed the following components:

- ``MNNConvert``, which is an executable located in
  ``$HOME/open-source/MNN/build``. We will use
  it to convert models from ``ONNX`` to MNN format.
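As an optional check, assuming you used the ``$HOME/open-source/MNN`` location
from the commands above, you can verify that the converter is visible on your
``PATH``:

.. code-block:: bash

   # Should print a path like $HOME/open-source/MNN/build/MNNConvert.
   # If nothing is printed, re-run the "export PATH=..." command above.
   which MNNConvert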
3. Export the model to ONNX
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

First, let us rename our pre-trained model:

.. code-block:: bash

   cd egs/librispeech/ASR

   cd k2fsa-zipformer-bilingual-zh-en-t/exp

   ln -s pretrained.pt epoch-99.pt

   cd ../..

Next, we use the following code to export our model:

.. code-block:: bash

   dir=./k2fsa-zipformer-bilingual-zh-en-t

   ./pruned_transducer_stateless7_streaming/export-onnx-zh.py \
     --tokens $dir/data/lang_char_bpe/tokens.txt \
     --exp-dir $dir/exp \
     --use-averaged-model 0 \
     --epoch 99 \
     --avg 1 \
     --decode-chunk-len 32 \
     --num-encoder-layers "2,2,2,2,2" \
     --feedforward-dims "768,768,768,768,768" \
     --nhead "4,4,4,4,4" \
     --encoder-dims "256,256,256,256,256" \
     --attention-dims "192,192,192,192,192" \
     --encoder-unmasked-dims "192,192,192,192,192" \
     --zipformer-downsampling-factors "1,2,4,8,2" \
     --cnn-module-kernels "31,31,31,31,31" \
     --decoder-dim 512 \
     --joiner-dim 512
.. caution::

   If your model has different configuration parameters, please change them accordingly.

.. hint::

   We have renamed our model to ``epoch-99.pt`` so that we can use ``--epoch 99``.
   There is only one pre-trained model, so we use ``--avg 1 --use-averaged-model 0``.

   If you have trained a model yourself and have all checkpoints
   available, please first use ``decode.py`` to tune ``--epoch --avg``
   and select the best combination with ``--use-averaged-model 1``.

After the above step, we will get the following files:

.. code-block:: bash

   ls -lh k2fsa-zipformer-bilingual-zh-en-t/exp/*.onnx

   .rw-rw-r-- 88,435,414 meixu 2023-05-12 10:05 encoder-epoch-99-avg-1.onnx
   .rw-rw-r-- 13,876,389 meixu 2023-05-12 10:05 decoder-epoch-99-avg-1.onnx
   .rw-rw-r-- 12,833,674 meixu 2023-05-12 10:05 joiner-epoch-99-avg-1.onnx
.. _zipformer-transducer-step-4-convert-onnx-to-mnn:

4. Convert model from ONNX to MNN
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. hint::

   Make sure you have set up the ``PATH`` environment variable
   in :ref:`export_for_mnn_install_mnn`. Otherwise,
   it will throw an error saying that ``MNNConvert`` could not be found.
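If you are running the conversion in a new terminal session, you may need to
put ``MNNConvert`` on the ``PATH`` again first; this assumes MNN was built in
``$HOME/open-source/MNN`` as in the installation step above:

.. code-block:: bash

   # Re-export PATH in the current shell (adjust the path if you built MNN elsewhere).
   export PATH=$HOME/open-source/MNN/build:$PATH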
Now, it's time to export our models to `MNN`_.

.. code-block:: bash

   cd k2fsa-zipformer-bilingual-zh-en-t/exp/

   MNNConvert -f ONNX --modelFile encoder-epoch-99-avg-1.onnx --MNNModel encoder-epoch-99-avg-1.mnn --bizCode MNN
   MNNConvert -f ONNX --modelFile decoder-epoch-99-avg-1.onnx --MNNModel decoder-epoch-99-avg-1.mnn --bizCode MNN
   MNNConvert -f ONNX --modelFile joiner-epoch-99-avg-1.onnx --MNNModel joiner-epoch-99-avg-1.mnn --bizCode MNN

.. note::

   You will see the following log output:

   .. literalinclude:: ./code/export-zipformer-transducer-for-mnn-output.txt

It will generate the following files:

.. code-block:: bash

   ls -lh k2fsa-zipformer-bilingual-zh-en-t/exp/*.mnn

   .rw-rw-r-- 12,836,004 meixu 2023-05-09 15:12 joiner-epoch-99-avg-1.mnn
   .rw-rw-r-- 13,917,864 meixu 2023-05-09 15:12 decoder-epoch-99-avg-1.mnn
   .rw-rw-r-- 89,065,932 meixu 2023-05-09 15:13 encoder-epoch-99-avg-1.mnn

Congratulations! You have successfully exported a model from PyTorch to `MNN`_!

Now you can use this model in `sherpa-mnn`_.
Please refer to the following documentation:

- Linux/aarch64: `<https://k2-fsa.github.io/sherpa/mnn/install/index.html>`_
docs/source/model-export/export-mnn.rst (new file, 27 lines)

@@ -0,0 +1,27 @@
.. _icefall_export_to_mnn:

Export to mnn
==============

We support exporting the following models
to `mnn <https://github.com/alibaba/MNN>`_:

- `Zipformer transducer models <https://github.com/k2-fsa/icefall/tree/master/egs/librispeech/ASR/pruned_transducer_stateless7_streaming>`_

We also provide `sherpa-mnn`_
for performing speech recognition using `MNN`_ with exported models.
It has been tested on the following platforms:

- Linux
- RK3588s

`sherpa-mnn`_ is self-contained and can be statically linked to produce
a binary containing everything needed. Please refer
to its documentation for details:

- `<https://k2-fsa.github.io/sherpa/mnn/index.html>`_

.. toctree::

   export-mnn-zipformer
docs/source/model-export/index.rst: 1 addition

@@ -12,3 +12,4 @@ In this section, we describe various ways to export models.
    export-with-torch-jit-script
    export-onnx
    export-ncnn
+   export-mnn