diff --git a/.gitignore b/.gitignore
index fa18ca83c..8871ba049 100644
--- a/.gitignore
+++ b/.gitignore
@@ -24,6 +24,7 @@ node_modules
 # Ignore all text files
 *.txt
+!docs/source/model-export/code/*.txt
 
 # Ignore files related to API keys
 .env
diff --git a/docs/source/model-export/code/export-zipformer-transducer-for-mnn-output.txt b/docs/source/model-export/code/export-zipformer-transducer-for-mnn-output.txt
new file mode 100644
index 000000000..fefca1f6e
--- /dev/null
+++ b/docs/source/model-export/code/export-zipformer-transducer-for-mnn-output.txt
@@ -0,0 +1,38 @@
+# encoder
+/audio/code/MNN/build/MNNConvert -f ONNX --modelFile encoder-epoch-99-avg-1.onnx --MNNModel encoder-epoch-99-avg-1.mnn --bizCode MNN
+The device support i8sdot:0, support fp16:0, support i8mm: 0
+Start to Convert Other Model Format To MNN Model...
+[16:52:23] /audio/code/MNN/tools/converter/source/onnx/onnxConverter.cpp:98: ONNX Model ir version: 7
+[16:52:23] /audio/code/MNN/tools/converter/source/onnx/onnxConverter.cpp:99: ONNX Model opset version: 13
+Start to Optimize the MNN Net...
+88 op name is empty or dup, set to Unsqueeze88
+188 op name is empty or dup, set to Unsqueeze188
+215 op name is empty or dup, set to Shape215
+...
+inputTensors : [ x, cached_avg_0, cached_len_0, cached_key_0, cached_val_0, cached_conv1_0, cached_val2_0, cached_conv2_0, cached_avg_1, cached_len_1, cached_key_1, cached_val_1, cached_conv1_1, cached_val2_1, cached_conv2_1, cached_avg_2, cached_len_2, cached_key_2, cached_val_2, cached_conv1_2, cached_val2_2, cached_conv2_2, cached_avg_3, cached_len_3, cached_key_3, cached_val_3, cached_conv1_3, cached_val2_3, cached_conv2_3, cached_avg_4, cached_len_4, cached_key_4, cached_val_4, cached_conv1_4, cached_val2_4, cached_conv2_4, ]
+outputTensors: [ encoder_out, new_cached_avg_0, new_cached_avg_1, new_cached_avg_2, new_cached_avg_3, new_cached_avg_4, new_cached_conv1_0, new_cached_conv1_1, new_cached_conv1_2, new_cached_conv1_3, new_cached_conv1_4, new_cached_conv2_0, new_cached_conv2_1, new_cached_conv2_2, new_cached_conv2_3, new_cached_conv2_4, new_cached_key_0, new_cached_key_1, new_cached_key_2, new_cached_key_3, new_cached_key_4, new_cached_len_0, new_cached_len_1, new_cached_len_2, new_cached_len_3, new_cached_len_4, new_cached_val2_0, new_cached_val2_1, new_cached_val2_2, new_cached_val2_3, new_cached_val2_4, new_cached_val_0, new_cached_val_1, new_cached_val_2, new_cached_val_3, new_cached_val_4, ]
+Converted Success!
+
+# decoder
+/audio/code/MNN/build/MNNConvert -f ONNX --modelFile decoder-epoch-99-avg-1.onnx --MNNModel decoder-epoch-99-avg-1.mnn --bizCode MNN
+The device support i8sdot:0, support fp16:0, support i8mm: 0
+Start to Convert Other Model Format To MNN Model...
+[16:51:58] /audio/code/MNN/tools/converter/source/onnx/onnxConverter.cpp:98: ONNX Model ir version: 7
+[16:51:58] /audio/code/MNN/tools/converter/source/onnx/onnxConverter.cpp:99: ONNX Model opset version: 13
+Start to Optimize the MNN Net...
+167 op name is empty or dup, set to Unsqueeze167
+inputTensors : [ y, ]
+outputTensors: [ decoder_out, ]
+The model has subgraphs, please use MNN::Module to run it
+Converted Success!
+
+# joiner
+/audio/code/MNN/build/MNNConvert -f ONNX --modelFile joiner-epoch-99-avg-1.onnx --MNNModel joiner-epoch-99-avg-1.mnn --bizCode MNN
+The device support i8sdot:0, support fp16:0, support i8mm: 0
+Start to Convert Other Model Format To MNN Model...
+[16:51:01] /audio/code/MNN/tools/converter/source/onnx/onnxConverter.cpp:98: ONNX Model ir version: 7
+[16:51:01] /audio/code/MNN/tools/converter/source/onnx/onnxConverter.cpp:99: ONNX Model opset version: 13
+Start to Optimize the MNN Net...
+inputTensors : [ encoder_out, decoder_out, ]
+outputTensors: [ logit, ]
+Converted Success!
diff --git a/docs/source/model-export/export-mnn-zipformer.rst b/docs/source/model-export/export-mnn-zipformer.rst
new file mode 100644
index 000000000..5c431a14f
--- /dev/null
+++ b/docs/source/model-export/export-mnn-zipformer.rst
@@ -0,0 +1,187 @@
+.. _export_streaming_zipformer_transducer_models_to_mnn:
+
+Export streaming Zipformer transducer models to MNN
+----------------------------------------------------
+
+We use the pre-trained model from the following repository as an example:
+
+`<https://huggingface.co/pfluo/k2fsa-zipformer-bilingual-zh-en-t>`_
+
+We will show you step by step how to export it to `MNN`_ and run it with `sherpa-mnn`_.
+
+.. hint::
+
+   We use ``Ubuntu 20.04``, ``torch 2.0.0``, and ``Python 3.8`` for testing.
+
+.. caution::
+
+   Please use a more recent version of PyTorch. For instance, ``torch 1.8``
+   may ``not`` work.
+
+1. Download the pre-trained model
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. hint::
+
+   You have to install `git-lfs`_ before you continue.
+
+
+.. code-block:: bash
+
+   cd egs/librispeech/ASR
+   git clone https://huggingface.co/pfluo/k2fsa-zipformer-bilingual-zh-en-t
+
+   cd ..
+
+In the above code, we downloaded the pre-trained model into the directory
+``egs/librispeech/ASR/k2fsa-zipformer-bilingual-zh-en-t``.
+
+.. _export_for_mnn_install_mnn:
+
+2. Install MNN
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: bash
+
+   # We put MNN into $HOME/open-source/MNN
+   # You can change it to anywhere you like
+
+   cd $HOME
+   mkdir -p open-source
+   cd open-source
+
+   git clone https://github.com/alibaba/MNN
+   cd MNN
+   mkdir build && cd build
+
+   cmake \
+     -DMNN_BUILD_CONVERTER=ON \
+     -DMNN_BUILD_TORCH=ON \
+     -DMNN_BUILD_TOOLS=ON \
+     -DMNN_BUILD_BENCHMARK=ON \
+     -DMNN_EVALUATION=ON \
+     -DMNN_BUILD_DEMO=ON \
+     -DMNN_BUILD_TEST=ON \
+     -DMNN_BUILD_QUANTOOLS=ON \
+     ..
+
+   make -j4
+
+   cd ..
+
+   # Note: $PWD here is $HOME/open-source/MNN
+
+   export PATH=$PWD/build:$PATH
+
+Congratulations! You have successfully installed the following components:
+
+  - ``MNNConvert``, which is an executable located in
+    ``$HOME/open-source/MNN/build``. We will use
+    it to convert models from ``ONNX``.
+
+
+3. Export the model to ONNX
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+First, let us rename our pre-trained model:
+
+.. code-block:: bash
+
+   cd egs/librispeech/ASR
+
+   cd k2fsa-zipformer-bilingual-zh-en-t/exp
+
+   ln -s pretrained.pt epoch-99.pt
+
+   cd ../..
+
+Next, we use the following code to export our model:
+
+.. code-block:: bash
+
+   dir=./k2fsa-zipformer-bilingual-zh-en-t
+
+   ./pruned_transducer_stateless7_streaming/export-onnx-zh.py \
+     --tokens $dir/data/lang_char_bpe/tokens.txt \
+     --exp-dir $dir/exp \
+     --use-averaged-model 0 \
+     --epoch 99 \
+     --avg 1 \
+     --decode-chunk-len 32 \
+     --num-encoder-layers "2,2,2,2,2" \
+     --feedforward-dims "768,768,768,768,768" \
+     --nhead "4,4,4,4,4" \
+     --encoder-dims "256,256,256,256,256" \
+     --attention-dims "192,192,192,192,192" \
+     --encoder-unmasked-dims "192,192,192,192,192" \
+     --zipformer-downsampling-factors "1,2,4,8,2" \
+     --cnn-module-kernels "31,31,31,31,31" \
+     --decoder-dim 512 \
+     --joiner-dim 512
+
+.. caution::
+
+   If your model has different configuration parameters, please change them accordingly.
+
+.. hint::
+
+   We have renamed our model to ``epoch-99.pt`` so that we can use ``--epoch 99``.
+   There is only one pre-trained model, so we use ``--avg 1 --use-averaged-model 0``.
+
+   If you have trained a model by yourself and if you have all checkpoints
+   available, please first use ``decode.py`` to tune ``--epoch --avg``
+   and select the best combination with ``--use-averaged-model 1``.
+
+After the above step, we will get the following files:
+
+.. code-block:: bash
+
+   ls -lh k2fsa-zipformer-bilingual-zh-en-t/exp/*.onnx
+
+   .rw-rw-r-- 88,435,414 meixu 2023-05-12 10:05 encoder-epoch-99-avg-1.onnx
+   .rw-rw-r-- 13,876,389 meixu 2023-05-12 10:05 decoder-epoch-99-avg-1.onnx
+   .rw-rw-r-- 12,833,674 meixu 2023-05-12 10:05 joiner-epoch-99-avg-1.onnx
+
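+(Optional) Before converting, you can sanity-check the exported ONNX models
+with ``onnxruntime``. The snippet below is only a sketch and is not part of
+the export pipeline: it assumes that you have run ``pip install onnxruntime``
+and that your current directory is ``k2fsa-zipformer-bilingual-zh-en-t/exp``.
+
+.. code-block:: python
+
+   # Print the input/output names of each exported ONNX model.
+   # Assumption: onnxruntime is installed and the *.onnx files generated
+   # by the export step above are in the current directory.
+   import onnxruntime as ort
+
+   for filename in (
+       "encoder-epoch-99-avg-1.onnx",
+       "decoder-epoch-99-avg-1.onnx",
+       "joiner-epoch-99-avg-1.onnx",
+   ):
+       sess = ort.InferenceSession(filename, providers=["CPUExecutionProvider"])
+       print(filename)
+       print("  inputs :", [i.name for i in sess.get_inputs()])
+       print("  outputs:", [o.name for o in sess.get_outputs()])
+
+The names printed here should match the ``inputTensors`` and ``outputTensors``
+lists reported by ``MNNConvert`` in the next step.
+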
+.. _zipformer-transducer-step-4-convert-onnx-to-mnn:
+
+4. Convert the model from ONNX to MNN
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. hint::
+
+   Make sure you have set up the ``PATH`` environment variable
+   in :ref:`export_for_mnn_install_mnn`. Otherwise,
+   you will get an error saying that ``MNNConvert`` could not be found.
+
+Now it is time to convert our models to `MNN`_.
+
+.. code-block:: bash
+
+   cd k2fsa-zipformer-bilingual-zh-en-t/exp/
+
+   MNNConvert -f ONNX --modelFile encoder-epoch-99-avg-1.onnx --MNNModel encoder-epoch-99-avg-1.mnn --bizCode MNN
+   MNNConvert -f ONNX --modelFile decoder-epoch-99-avg-1.onnx --MNNModel decoder-epoch-99-avg-1.mnn --bizCode MNN
+   MNNConvert -f ONNX --modelFile joiner-epoch-99-avg-1.onnx --MNNModel joiner-epoch-99-avg-1.mnn --bizCode MNN
+
+.. note::
+
+   You will see the following log output:
+
+   .. literalinclude:: ./code/export-zipformer-transducer-for-mnn-output.txt
+
+It will generate the following files:
+
+.. code-block:: bash
+
+   ls -lh k2fsa-zipformer-bilingual-zh-en-t/exp/*.mnn
+
+   .rw-rw-r-- 12,836,004 meixu 2023-05-09 15:12 joiner-epoch-99-avg-1.mnn
+   .rw-rw-r-- 13,917,864 meixu 2023-05-09 15:12 decoder-epoch-99-avg-1.mnn
+   .rw-rw-r-- 89,065,932 meixu 2023-05-09 15:13 encoder-epoch-99-avg-1.mnn
+
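+(Optional) If you have the MNN Python bindings installed (``pip install MNN``),
+you can run a quick smoke test to confirm that the converted files can be
+loaded. This is only an illustrative sketch, not how `sherpa-mnn`_ runs the
+models, and the API names used here (``MNN.Interpreter``, ``createSession``,
+``getSessionInputAll``) may differ between MNN versions. Since the converter
+reported that the decoder contains subgraphs and should be run with
+``MNN::Module``, we only load the joiner here.
+
+.. code-block:: python
+
+   # Minimal smoke test: load the converted joiner and list its input tensors.
+   # Assumptions: `pip install MNN` provides the Python bindings, and the
+   # *.mnn files generated above are in the current directory.
+   import MNN
+
+   interpreter = MNN.Interpreter("joiner-epoch-99-avg-1.mnn")
+   session = interpreter.createSession()
+
+   for name, tensor in interpreter.getSessionInputAll(session).items():
+       print(name, tensor.getShape())
+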
+Congratulations! You have successfully exported a model from PyTorch to `MNN`_!
+
+Now you can use this model in `sherpa-mnn`_.
+Please refer to the following documentation:
+
+  - Linux/aarch64: ``_
diff --git a/docs/source/model-export/export-mnn.rst b/docs/source/model-export/export-mnn.rst
new file mode 100644
index 000000000..3cb76d441
--- /dev/null
+++ b/docs/source/model-export/export-mnn.rst
@@ -0,0 +1,27 @@
+.. _icefall_export_to_mnn:
+
+Export to mnn
+==============
+
+We support exporting the following models
+to `mnn <https://github.com/alibaba/MNN>`_:
+
+  - `Zipformer transducer models `_
+
+We also provide `sherpa-mnn`_
+for performing speech recognition using `MNN`_ with exported models.
+It has been tested on the following platforms:
+
+  - Linux
+  - RK3588s
+
+`sherpa-mnn`_ is self-contained and can be statically linked to produce
+a binary containing everything needed. Please refer
+to its documentation for details:
+
+  - ``_
+
+
+.. toctree::
+
+   export-mnn-zipformer
diff --git a/docs/source/model-export/index.rst b/docs/source/model-export/index.rst
index 9b7a2ee2d..b3754e267 100644
--- a/docs/source/model-export/index.rst
+++ b/docs/source/model-export/index.rst
@@ -12,3 +12,4 @@ In this section, we describe various ways to export models.
    export-with-torch-jit-script
    export-onnx
    export-ncnn
+   export-mnn