diff --git a/_sources/index.rst.txt b/_sources/index.rst.txt index 29491e3dc..be9977ca9 100644 --- a/_sources/index.rst.txt +++ b/_sources/index.rst.txt @@ -21,6 +21,7 @@ speech recognition recipes using `k2 `_. :caption: Contents: installation/index + model-export/index recipes/index contributing/index huggingface/index diff --git a/_sources/model-export/export-model-state-dict.rst.txt b/_sources/model-export/export-model-state-dict.rst.txt new file mode 100644 index 000000000..c3bbd5708 --- /dev/null +++ b/_sources/model-export/export-model-state-dict.rst.txt @@ -0,0 +1,135 @@ +Export model.state_dict() +========================= + +When to use it +-------------- + +During model training, we save checkpoints periodically to disk. + +A checkpoint contains the following information: + + - ``model.state_dict()`` + - ``optimizer.state_dict()`` + - and some other information related to training + +When we need to resume the training process from some point, we need a checkpoint. +However, if we want to publish the model for inference, then only +``model.state_dict()`` is needed. In this case, we need to strip all other information +except ``model.state_dict()`` to reduce the file size of the published model. + +How to export +------------- + +Every recipe contains a file ``export.py`` that you can use to +export ``model.state_dict()`` by taking some checkpoints as inputs. + +.. hint:: + + Each ``export.py`` contains well-documented usage information. + +In the following, we use +``_ +as an example. + +.. note:: + + The steps for other recipes are almost the same. + +.. code-block:: bash + + cd egs/librispeech/ASR + + ./pruned_transducer_stateless3/export.py \ + --exp-dir ./pruned_transducer_stateless3/exp \ + --bpe-model data/lang_bpe_500/bpe.model \ + --epoch 20 \ + --avg 10 + +will generate a file ``pruned_transducer_stateless3/exp/pretrained.pt``, which +is a dict containing ``{"model": model.state_dict()}`` saved by ``torch.save()``. + +How to use the exported model +----------------------------- + +For each recipe, we provide pretrained models hosted on huggingface. +You can find links to pretrained models in ``RESULTS.md`` of each dataset. + +In the following, we demonstrate how to use the pretrained model from +``_. + +.. code-block:: bash + + cd egs/librispeech/ASR + + git lfs install + git clone https://huggingface.co/csukuangfj/icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13 + +After cloning the repo with ``git lfs``, you will find several files in the folder +``icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/exp`` +that have a prefix ``pretrained-``. Those files contain ``model.state_dict()`` +exported by the above ``export.py``. + +In each recipe, there is also a file ``pretrained.py``, which can use +``pretrained-xxx.pt`` to decode waves. The following is an example: + +.. 
code-block:: bash + + cd egs/librispeech/ASR + + ./pruned_transducer_stateless3/pretrained.py \ + --checkpoint ./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/exp/pretrained-iter-1224000-avg-14.pt \ + --bpe-model ./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/data/lang_bpe_500/bpe.model \ + --method greedy_search \ + ./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1089-134686-0001.wav \ + ./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1221-135766-0001.wav \ + ./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1221-135766-0002.wav + +The above commands show how to use the exported model with ``pretrained.py`` to +decode multiple sound files. Its output is given as follows for reference: + +.. literalinclude:: ./code/export-model-state-dict-pretrained-out.txt + +Use the exported model to run decode.py +--------------------------------------- + +When we publish the model, we always note down its WERs on some test +dataset in ``RESULTS.md``. This section describes how to use the +pretrained model to reproduce the WER. + +.. code-block:: bash + + cd egs/librispeech/ASR + git lfs install + git clone https://huggingface.co/csukuangfj/icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13 + + cd icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/exp + ln -s pretrained-iter-1224000-avg-14.pt epoch-9999.pt + cd ../.. + +We create a symlink with name ``epoch-9999.pt`` to ``pretrained-iter-1224000-avg-14.pt``, +so that we can pass ``--epoch 9999 --avg 1`` to ``decode.py`` in the following +command: + +.. code-block:: bash + + ./pruned_transducer_stateless3/decode.py \ + --epoch 9999 \ + --avg 1 \ + --exp-dir ./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/exp \ + --lang-dir ./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/data/lang_bpe_500 \ + --max-duration 600 \ + --decoding-method greedy_search + +You will find the decoding results in +``./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/exp/greedy_search``. + +.. caution:: + + For some recipes, you also need to pass ``--use-averaged-model False`` + to ``decode.py``. The reason is that the exported pretrained model is already + the averaged one. + +.. hint:: + + Before running ``decode.py``, we assume that you have already run + ``prepare.sh`` to prepare the test dataset. diff --git a/_sources/model-export/export-ncnn.rst.txt b/_sources/model-export/export-ncnn.rst.txt new file mode 100644 index 000000000..3dbb8b514 --- /dev/null +++ b/_sources/model-export/export-ncnn.rst.txt @@ -0,0 +1,12 @@ +Export to ncnn +============== + +We support exporting LSTM transducer models to `ncnn `_. + +Please refer to :ref:`export-model-for-ncnn` for details. + +We also provide ``_ +performing speech recognition using ``ncnn`` with exported models. +It has been tested on Linux, macOS, Windows, and Raspberry Pi. The project is +self-contained and can be statically linked to produce a binary containing +everything needed. diff --git a/_sources/model-export/export-onnx.rst.txt b/_sources/model-export/export-onnx.rst.txt new file mode 100644 index 000000000..dd4b3437a --- /dev/null +++ b/_sources/model-export/export-onnx.rst.txt @@ -0,0 +1,69 @@ +Export to ONNX +============== + +In this section, we describe how to export models to ONNX. + +.. hint:: + + Only non-streaming conformer transducer models are tested. 
+ + +When to use it +-------------- + +Export to ONNX if you want to run the pretrained model +with an inference framework that supports ONNX. + + +How to export +------------- + +We use +``_ +as an example in the following. + +.. code-block:: bash + + cd egs/librispeech/ASR + epoch=14 + avg=2 + + ./pruned_transducer_stateless3/export.py \ + --exp-dir ./pruned_transducer_stateless3/exp \ + --bpe-model data/lang_bpe_500/bpe.model \ + --epoch $epoch \ + --avg $avg \ + --onnx 1 + +It will generate the following files inside ``pruned_transducer_stateless3/exp``: + + - ``encoder.onnx`` + - ``decoder.onnx`` + - ``joiner.onnx`` + - ``joiner_encoder_proj.onnx`` + - ``joiner_decoder_proj.onnx`` + +You can use ``./pruned_transducer_stateless3/onnx_pretrained.py`` to decode +waves with the generated files: + +.. code-block:: bash + + ./pruned_transducer_stateless3/onnx_pretrained.py \ + --bpe-model ./data/lang_bpe_500/bpe.model \ + --encoder-model-filename ./pruned_transducer_stateless3/exp/encoder.onnx \ + --decoder-model-filename ./pruned_transducer_stateless3/exp/decoder.onnx \ + --joiner-model-filename ./pruned_transducer_stateless3/exp/joiner.onnx \ + --joiner-encoder-proj-model-filename ./pruned_transducer_stateless3/exp/joiner_encoder_proj.onnx \ + --joiner-decoder-proj-model-filename ./pruned_transducer_stateless3/exp/joiner_decoder_proj.onnx \ + /path/to/foo.wav \ + /path/to/bar.wav \ + /path/to/baz.wav + + +How to use the exported model +----------------------------- + +We also provide ``_, +which performs speech recognition using `onnxruntime `_ +with exported models. +It has been tested on Linux, macOS, and Windows. diff --git a/_sources/model-export/export-with-torch-jit-script.rst.txt b/_sources/model-export/export-with-torch-jit-script.rst.txt new file mode 100644 index 000000000..a041dc1d5 --- /dev/null +++ b/_sources/model-export/export-with-torch-jit-script.rst.txt @@ -0,0 +1,58 @@ +.. _export-model-with-torch-jit-script: + +Export model with torch.jit.script() +==================================== + +In this section, we describe how to export a model via +``torch.jit.script()``. + +When to use it +-------------- + +If we want to use our trained model with torchscript, +we can use ``torch.jit.script()``. + +.. hint:: + + See :ref:`export-model-with-torch-jit-trace` + if you want to use ``torch.jit.trace()``. + +How to export +------------- + +We use +``_ +as an example in the following. + +.. code-block:: bash + + cd egs/librispeech/ASR + epoch=14 + avg=1 + + ./pruned_transducer_stateless3/export.py \ + --exp-dir ./pruned_transducer_stateless3/exp \ + --bpe-model data/lang_bpe_500/bpe.model \ + --epoch $epoch \ + --avg $avg \ + --jit 1 + +It will generate a file ``cpu_jit.pt`` in ``pruned_transducer_stateless3/exp``. + +.. caution:: + + Don't be confused by ``cpu`` in ``cpu_jit.pt``. We move all parameters + to CPU before saving it into a ``pt`` file; that's why we use ``cpu`` + in the filename. + +How to use the exported model +----------------------------- + +Please refer to the following pages for usage: + +- ``_ +- ``_ +- ``_ +- ``_ +- ``_ +- ``_ diff --git a/_sources/model-export/export-with-torch-jit-trace.rst.txt b/_sources/model-export/export-with-torch-jit-trace.rst.txt new file mode 100644 index 000000000..506459909 --- /dev/null +++ b/_sources/model-export/export-with-torch-jit-trace.rst.txt @@ -0,0 +1,69 @@ +.. 
_export-model-with-torch-jit-trace: + +Export model with torch.jit.trace() +=================================== + +In this section, we describe how to export a model via +``torch.jit.trace()``. + +When to use it +-------------- + +If we want to use our trained model with torchscript, +we can use ``torch.jit.trace()``. + +.. hint:: + + See :ref:`export-model-with-torch-jit-script` + if you want to use ``torch.jit.script()``. + +How to export +------------- + +We use +``_ +as an example in the following. + +.. code-block:: bash + + iter=468000 + avg=16 + + cd egs/librispeech/ASR + + ./lstm_transducer_stateless2/export.py \ + --exp-dir ./lstm_transducer_stateless2/exp \ + --bpe-model data/lang_bpe_500/bpe.model \ + --iter $iter \ + --avg $avg \ + --jit-trace 1 + +It will generate three files inside ``lstm_transducer_stateless2/exp``: + + - ``encoder_jit_trace.pt`` + - ``decoder_jit_trace.pt`` + - ``joiner_jit_trace.pt`` + +You can use +``_ +to decode sound files with the following commands: + +.. code-block:: bash + + cd egs/librispeech/ASR + ./lstm_transducer_stateless2/jit_pretrained.py \ + --bpe-model ./data/lang_bpe_500/bpe.model \ + --encoder-model-filename ./lstm_transducer_stateless2/exp/encoder_jit_trace.pt \ + --decoder-model-filename ./lstm_transducer_stateless2/exp/decoder_jit_trace.pt \ + --joiner-model-filename ./lstm_transducer_stateless2/exp/joiner_jit_trace.pt \ + /path/to/foo.wav \ + /path/to/bar.wav \ + /path/to/baz.wav + +How to use the exported models +------------------------------ + +Please refer to +``_ +for its usage in `sherpa `_. +You can also find pretrained models there. diff --git a/_sources/model-export/index.rst.txt b/_sources/model-export/index.rst.txt new file mode 100644 index 000000000..9b7a2ee2d --- /dev/null +++ b/_sources/model-export/index.rst.txt @@ -0,0 +1,14 @@ +Model export +============ + +In this section, we describe various ways to export models. + + + +.. toctree:: + + export-model-state-dict + export-with-torch-jit-trace + export-with-torch-jit-script + export-onnx + export-ncnn diff --git a/_sources/recipes/librispeech/lstm_pruned_stateless_transducer.rst.txt b/_sources/recipes/librispeech/lstm_pruned_stateless_transducer.rst.txt index b9d5bdcba..643855cc2 100644 --- a/_sources/recipes/librispeech/lstm_pruned_stateless_transducer.rst.txt +++ b/_sources/recipes/librispeech/lstm_pruned_stateless_transducer.rst.txt @@ -515,6 +515,8 @@ To use the generated files with ``./lstm_transducer_stateless2/jit_pretrained``: Please see ``_ for how to use the exported models in ``sherpa``. +.. _export-model-for-ncnn: + Export model for ncnn ~~~~~~~~~~~~~~~~~~~~~ diff --git a/contributing/code-style.html b/contributing/code-style.html index 514e1c12d..8e3785bf5 100644 --- a/contributing/code-style.html +++ b/contributing/code-style.html @@ -42,6 +42,7 @@

Contents:

  • Installation
  • +
  • Model export
  • Recipes
  • Contributing
    • Contributing to Documentation
    • diff --git a/contributing/doc.html b/contributing/doc.html index 39fa0c6ea..b965da43c 100644 --- a/contributing/doc.html +++ b/contributing/doc.html @@ -42,6 +42,7 @@

      Contents:

      • Installation
      • +
      • Model export
      • Recipes
      • Contributing
        • Contributing to Documentation
        • diff --git a/contributing/how-to-create-a-recipe.html b/contributing/how-to-create-a-recipe.html index b82597c54..528ede4de 100644 --- a/contributing/how-to-create-a-recipe.html +++ b/contributing/how-to-create-a-recipe.html @@ -42,6 +42,7 @@

          Contents:

          • Installation
          • +
          • Model export
          • Recipes
          • Contributing
            • Contributing to Documentation
            • diff --git a/contributing/index.html b/contributing/index.html index 2d6005c7a..b9754b93c 100644 --- a/contributing/index.html +++ b/contributing/index.html @@ -42,6 +42,7 @@

              Contents:

              • Installation
              • +
              • Model export
              • Recipes
              • Contributing
                • Contributing to Documentation
                • diff --git a/genindex.html b/genindex.html index 47fa5b42e..c81e4666e 100644 --- a/genindex.html +++ b/genindex.html @@ -39,6 +39,7 @@

                  Contents:

                  • Installation
                  • +
                  • Model export
                  • Recipes
                  • Contributing
                  • Huggingface
                  • diff --git a/huggingface/index.html b/huggingface/index.html index 1c0bac312..1b8085f25 100644 --- a/huggingface/index.html +++ b/huggingface/index.html @@ -42,6 +42,7 @@

                    Contents:

                    • Installation
                    • +
                    • Model export
                    • Recipes
                    • Contributing
                    • Huggingface
                        diff --git a/huggingface/pretrained-models.html b/huggingface/pretrained-models.html index a96d50cc3..4103606f6 100644 --- a/huggingface/pretrained-models.html +++ b/huggingface/pretrained-models.html @@ -42,6 +42,7 @@

                        Contents:

                        • Installation
                        • +
                        • Model export
                        • Recipes
                        • Contributing
                        • Huggingface
                            diff --git a/huggingface/spaces.html b/huggingface/spaces.html index f6e662956..1f719cb49 100644 --- a/huggingface/spaces.html +++ b/huggingface/spaces.html @@ -41,6 +41,7 @@

                            Contents:

                            • Installation
                            • +
                            • Model export
                            • Recipes
                            • Contributing
                            • Huggingface
                                diff --git a/index.html b/index.html index 4327402c4..5879713b2 100644 --- a/index.html +++ b/index.html @@ -41,6 +41,7 @@

                                Contents:

                                +
                              • Model export +
                              • Recipes
                                • aishell
                                • LibriSpeech
                                • diff --git a/installation/index.html b/installation/index.html index 09af4c947..692db633f 100644 --- a/installation/index.html +++ b/installation/index.html @@ -20,7 +20,7 @@ - + @@ -63,6 +63,7 @@
                                • YouTube Video
                              • +
                              • Model export
                              • Recipes
                              • Contributing
                              • Huggingface
                              • @@ -547,7 +548,7 @@ the following YouTube channel by - +
                                diff --git a/model-export/export-model-state-dict.html b/model-export/export-model-state-dict.html new file mode 100644 index 000000000..4f3c708e9 --- /dev/null +++ b/model-export/export-model-state-dict.html @@ -0,0 +1,262 @@ + + + + + + + Export model.state_dict() — icefall 0.1 documentation + + + + + + + + + + + + + + + + + + +
                                + + +
                                + +
                                +
                                +
                                + +
                                +
                                +
                                +
                                + +
                                +

                                Export model.state_dict()

                                +
                                +

                                When to use it

                                +

                                During model training, we save checkpoints periodically to disk.

                                +

                                A checkpoint contains the following information:

                                +
                                +
                                  +
                                • model.state_dict()

                                • +
                                • optimizer.state_dict()

                                • +
                                • and some other information related to training

                                • +
                                +
                                +

                                When we need to resume the training process from some point, we need a checkpoint. +However, if we want to publish the model for inference, then only +model.state_dict() is needed. In this case, we need to strip all other information +except model.state_dict() to reduce the file size of the published model.
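The stripping itself amounts to a few lines of Python. Below is a minimal sketch — the checkpoint filename is hypothetical, and the "model" key follows the checkpoint layout described above:
+
+import torch
+
+# Load a full training checkpoint (hypothetical filename).
+checkpoint = torch.load("epoch-20.pt", map_location="cpu")
+
+# Keep only the model weights; drop the optimizer state and other
+# training bookkeeping to shrink the published file.
+torch.save({"model": checkpoint["model"]}, "pretrained.pt")
+
In practice you do not need to write this yourself: the export.py script described below does it for you, and also supports averaging several checkpoints.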

                                +
                                +
                                +

                                How to export

                                +

                                Every recipe contains a file export.py that you can use to +export model.state_dict() by taking some checkpoints as inputs.

                                +
                                +

                                Hint

                                +

                                Each export.py contains well-documented usage information.

                                +
                                +

                                In the following, we use +https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/pruned_transducer_stateless3/export.py +as an example.

                                +
                                +

                                Note

                                +

                                The steps for other recipes are almost the same.

                                +
                                +
                                cd egs/librispeech/ASR
                                +
                                +./pruned_transducer_stateless3/export.py \
                                +  --exp-dir ./pruned_transducer_stateless3/exp \
                                +  --bpe-model data/lang_bpe_500/bpe.model \
                                +  --epoch 20 \
                                +  --avg 10
                                +
                                +
                                +

The above command will generate a file pruned_transducer_stateless3/exp/pretrained.pt, which +is a dict containing {"model": model.state_dict()} saved by torch.save().
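You can sanity-check the exported file from Python — a quick sketch, assuming it is run from egs/librispeech/ASR:
+
+import torch
+
+state = torch.load(
+    "pruned_transducer_stateless3/exp/pretrained.pt", map_location="cpu"
+)
+print(list(state.keys()))  # expected: ['model']
+# state["model"] is an ordinary state_dict mapping parameter names to tensors.
+print(len(state["model"]))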

                                +
                                +
                                +

                                How to use the exported model

                                +

                                For each recipe, we provide pretrained models hosted on huggingface. +You can find links to pretrained models in RESULTS.md of each dataset.

                                +

                                In the following, we demonstrate how to use the pretrained model from +https://huggingface.co/csukuangfj/icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13.

                                +
                                cd egs/librispeech/ASR
                                +
                                +git lfs install
                                +git clone https://huggingface.co/csukuangfj/icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13
                                +
                                +
                                +

                                After cloning the repo with git lfs, you will find several files in the folder +icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/exp +that have a prefix pretrained-. Those files contain model.state_dict() +exported by the above export.py.

                                +

                                In each recipe, there is also a file pretrained.py, which can use +pretrained-xxx.pt to decode waves. The following is an example:

                                +
                                cd egs/librispeech/ASR
                                +
                                +./pruned_transducer_stateless3/pretrained.py \
                                +   --checkpoint ./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/exp/pretrained-iter-1224000-avg-14.pt \
                                +   --bpe-model ./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/data/lang_bpe_500/bpe.model \
                                +   --method greedy_search \
                                +   ./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1089-134686-0001.wav \
                                +   ./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1221-135766-0001.wav \
                                +   ./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1221-135766-0002.wav
                                +
                                +
                                +

                                The above commands show how to use the exported model with pretrained.py to +decode multiple sound files. Its output is given as follows for reference:

                                +
                                2022-10-13 19:09:02,233 INFO [pretrained.py:265] {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'encoder_dim': 512, 'nhead': 8, 'dim_feedforward': 2048, 'num_encoder_layers': 12, 'decoder_dim': 512, 'joiner_dim': 512, 'model_warm_step': 3000, 'env_info': {'k2-version': '1.21', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '4810e00d8738f1a21278b0156a42ff396a2d40ac', 'k2-git-date': 'Fri Oct 7 19:35:03 2022', 'lhotse-version': '1.3.0.dev+missing.version.file', 'torch-version': '1.10.0+cu102', 'torch-cuda-available': False, 'torch-cuda-version': '10.2', 'python-version': '3.8', 'icefall-git-branch': 'onnx-doc-1013', 'icefall-git-sha1': 'c39cba5-dirty', 'icefall-git-date': 'Thu Oct 13 15:17:20 2022', 'icefall-path': '/k2-dev/fangjun/open-source/icefall-master', 'k2-path': '/k2-dev/fangjun/open-source/k2-master/k2/python/k2/__init__.py', 'lhotse-path': '/ceph-fj/fangjun/open-source-2/lhotse-jsonl/lhotse/__init__.py', 'hostname': 'de-74279-k2-test-4-0324160024-65bfd8b584-jjlbn', 'IP address': '10.177.74.203'}, 'checkpoint': './icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/exp/pretrained-iter-1224000-avg-14.pt', 'bpe_model': './icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/data/lang_bpe_500/bpe.model', 'method': 'greedy_search', 'sound_files': ['./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1089-134686-0001.wav', './icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1221-135766-0001.wav', './icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1221-135766-0002.wav'], 'sample_rate': 16000, 'beam_size': 4, 'beam': 4, 'max_contexts': 4, 'max_states': 8, 'context_size': 2, 'max_sym_per_frame': 1, 'simulate_streaming': False, 'decode_chunk_size': 16, 'left_context': 64, 'dynamic_chunk_training': False, 'causal_convolution': False, 'short_chunk_size': 25, 'num_left_chunks': 4, 'blank_id': 0, 'unk_id': 2, 'vocab_size': 500}
                                +2022-10-13 19:09:02,233 INFO [pretrained.py:271] device: cpu
                                +2022-10-13 19:09:02,233 INFO [pretrained.py:273] Creating model
                                +2022-10-13 19:09:02,612 INFO [train.py:458] Disable giga
                                +2022-10-13 19:09:02,623 INFO [pretrained.py:277] Number of model parameters: 78648040
                                +2022-10-13 19:09:02,951 INFO [pretrained.py:285] Constructing Fbank computer
                                +2022-10-13 19:09:02,952 INFO [pretrained.py:295] Reading sound files: ['./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1089-134686-0001.wav', './icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1221-135766-0001.wav', './icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1221-135766-0002.wav']
                                +2022-10-13 19:09:02,957 INFO [pretrained.py:301] Decoding started
                                +2022-10-13 19:09:06,700 INFO [pretrained.py:329] Using greedy_search
                                +2022-10-13 19:09:06,912 INFO [pretrained.py:388]
                                +./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1089-134686-0001.wav:
                                +AFTER EARLY NIGHTFALL THE YELLOW LAMPS WOULD LIGHT UP HERE AND THERE THE SQUALID QUARTER OF THE BROTHELS
                                +
                                +./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1221-135766-0001.wav:
                                +GOD AS A DIRECT CONSEQUENCE OF THE SIN WHICH MAN THUS PUNISHED HAD GIVEN HER A LOVELY CHILD WHOSE PLACE WAS ON THAT SAME DISHONORED BOSOM TO CONNECT HER PARENT FOREVER WITH THE RACE AND DESCENT OF MORTALS AND TO BE FINALLY A BLESSED SOUL IN HEAVEN
                                +
                                +./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1221-135766-0002.wav:
                                +YET THESE THOUGHTS AFFECTED HESTER PRYNNE LESS WITH HOPE THAN APPREHENSION
                                +
                                +
                                +2022-10-13 19:09:06,912 INFO [pretrained.py:390] Decoding Done
                                +
                                +
                                +
                                +
                                +

                                Use the exported model to run decode.py

                                +

                                When we publish the model, we always note down its WERs on some test +dataset in RESULTS.md. This section describes how to use the +pretrained model to reproduce the WER.

                                +
                                cd egs/librispeech/ASR
                                +git lfs install
                                +git clone https://huggingface.co/csukuangfj/icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13
                                +
                                +cd icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/exp
                                +ln -s pretrained-iter-1224000-avg-14.pt epoch-9999.pt
                                +cd ../..
                                +
                                +
                                +

We create a symlink named epoch-9999.pt that points to pretrained-iter-1224000-avg-14.pt. +Since decode.py loads checkpoints named epoch-&lt;N&gt;.pt from --exp-dir, passing +--epoch 9999 --avg 1 makes it load exactly this file in the following +command:

                                +
                                ./pruned_transducer_stateless3/decode.py \
                                +    --epoch 9999 \
                                +    --avg 1 \
                                +    --exp-dir ./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/exp \
                                +    --lang-dir ./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/data/lang_bpe_500 \
                                +    --max-duration 600 \
                                +    --decoding-method greedy_search
                                +
                                +
                                +

                                You will find the decoding results in +./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/exp/greedy_search.

                                +
                                +

                                Caution

                                +

                                For some recipes, you also need to pass --use-averaged-model False +to decode.py. The reason is that the exported pretrained model is already +the averaged one.

                                +
                                +
                                +

                                Hint

                                +

                                Before running decode.py, we assume that you have already run +prepare.sh to prepare the test dataset.

                                +
                                +
                                +
                                + + +
                                +
                                + +
                                +
                                +
                                +
                                + + + + \ No newline at end of file diff --git a/model-export/export-ncnn.html b/model-export/export-ncnn.html new file mode 100644 index 000000000..2338e1df1 --- /dev/null +++ b/model-export/export-ncnn.html @@ -0,0 +1,125 @@ + + + + + + + Export to ncnn — icefall 0.1 documentation + + + + + + + + + + + + + + + + + + +
                                + + +
                                + +
                                +
                                +
                                + +
                                +
                                +
                                +
                                + +
                                +

                                Export to ncnn

                                +

                                We support exporting LSTM transducer models to ncnn.

                                +

                                Please refer to Export model for ncnn for details.

                                +

We also provide https://github.com/k2-fsa/sherpa-ncnn, +which performs speech recognition using ncnn with exported models. +It has been tested on Linux, macOS, Windows, and Raspberry Pi. The project is +self-contained and can be statically linked to produce a binary containing +everything needed.

                                +
                                + + +
                                +
                                + +
                                +
                                +
                                +
                                + + + + \ No newline at end of file diff --git a/model-export/export-onnx.html b/model-export/export-onnx.html new file mode 100644 index 000000000..0c316d7d0 --- /dev/null +++ b/model-export/export-onnx.html @@ -0,0 +1,182 @@ + + + + + + + Export to ONNX — icefall 0.1 documentation + + + + + + + + + + + + + + + + + + +
                                + + +
                                + +
                                +
                                +
                                + +
                                +
                                +
                                +
                                + +
                                +

                                Export to ONNX

                                +

                                In this section, we describe how to export models to ONNX.

                                +
                                +

                                Hint

                                +

                                Only non-streaming conformer transducer models are tested.

                                +
                                +
                                +

                                When to use it

                                +

Export to ONNX if you want to run the pretrained model +with an inference framework that supports ONNX.

                                +
                                +
                                +

                                How to export

                                +

                                We use +https://github.com/k2-fsa/icefall/tree/master/egs/librispeech/ASR/pruned_transducer_stateless3 +as an example in the following.

                                +
                                cd egs/librispeech/ASR
                                +epoch=14
                                +avg=2
                                +
                                +./pruned_transducer_stateless3/export.py \
                                +  --exp-dir ./pruned_transducer_stateless3/exp \
                                +  --bpe-model data/lang_bpe_500/bpe.model \
                                +  --epoch $epoch \
                                +  --avg $avg \
                                +  --onnx 1
                                +
                                +
                                +

                                It will generate the following files inside pruned_transducer_stateless3/exp:

                                +
                                +
                                  +
                                • encoder.onnx

                                • +
                                • decoder.onnx

                                • +
                                • joiner.onnx

                                • +
                                • joiner_encoder_proj.onnx

                                • +
                                • joiner_decoder_proj.onnx

                                • +
                                +
                                +

You can use ./pruned_transducer_stateless3/onnx_pretrained.py to decode +waves with the generated files:

                                +
                                ./pruned_transducer_stateless3/onnx_pretrained.py \
                                +  --bpe-model ./data/lang_bpe_500/bpe.model \
                                +  --encoder-model-filename ./pruned_transducer_stateless3/exp/encoder.onnx \
                                +  --decoder-model-filename ./pruned_transducer_stateless3/exp/decoder.onnx \
                                +  --joiner-model-filename ./pruned_transducer_stateless3/exp/joiner.onnx \
                                +  --joiner-encoder-proj-model-filename ./pruned_transducer_stateless3/exp/joiner_encoder_proj.onnx \
                                +  --joiner-decoder-proj-model-filename ./pruned_transducer_stateless3/exp/joiner_decoder_proj.onnx \
                                +  /path/to/foo.wav \
                                +  /path/to/bar.wav \
                                +  /path/to/baz.wav
                                +
                                +
                                +
                                +
                                +

                                How to use the exported model

                                +

We also provide https://github.com/k2-fsa/sherpa-onnx, +which performs speech recognition using onnxruntime +with exported models. +It has been tested on Linux, macOS, and Windows.
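If you prefer to drive the exported files from your own Python code instead, the sketch below shows the general onnxruntime pattern for the encoder. The input layout (features plus their lengths, 80-dimensional fbank) and shapes are assumptions — query the session for the real input/output names rather than hard-coding them:
+
+import numpy as np
+import onnxruntime as ort
+
+session = ort.InferenceSession(
+    "pruned_transducer_stateless3/exp/encoder.onnx",
+    providers=["CPUExecutionProvider"],
+)
+
+# Inspect the model's actual input/output names instead of guessing.
+print([i.name for i in session.get_inputs()])
+print([o.name for o in session.get_outputs()])
+
+# Hypothetical run: one utterance with 100 frames of 80-dim fbank features.
+features = np.zeros((1, 100, 80), dtype=np.float32)
+feature_lens = np.array([100], dtype=np.int64)
+inputs = {
+    session.get_inputs()[0].name: features,
+    session.get_inputs()[1].name: feature_lens,
+}
+print([o.shape for o in session.run(None, inputs)])
+
Full recognition additionally runs decoder.onnx and joiner.onnx inside a search loop; see onnx_pretrained.py above for the complete logic.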

                                +
                                +
                                + + +
                                +
                                + +
                                +
                                +
                                +
                                + + + + \ No newline at end of file diff --git a/model-export/export-with-torch-jit-script.html b/model-export/export-with-torch-jit-script.html new file mode 100644 index 000000000..2d0771d0e --- /dev/null +++ b/model-export/export-with-torch-jit-script.html @@ -0,0 +1,172 @@ + + + + + + + Export model with torch.jit.script() — icefall 0.1 documentation + + + + + + + + + + + + + + + + + + +
                                + + +
                                + +
                                +
                                +
                                + +
                                +
                                +
                                +
                                + +
                                +

                                Export model with torch.jit.script()

                                +

                                In this section, we describe how to export a model via +torch.jit.script().

                                +
                                +

                                When to use it

                                +

                                If we want to use our trained model with torchscript, +we can use torch.jit.script().

                                +
                                +

                                Hint

                                +

                                See Export model with torch.jit.trace() +if you want to use torch.jit.trace().

                                +
                                +
                                +
                                +

                                How to export

                                +

                                We use +https://github.com/k2-fsa/icefall/tree/master/egs/librispeech/ASR/pruned_transducer_stateless3 +as an example in the following.

                                +
                                cd egs/librispeech/ASR
                                +epoch=14
                                +avg=1
                                +
                                +./pruned_transducer_stateless3/export.py \
                                +  --exp-dir ./pruned_transducer_stateless3/exp \
                                +  --bpe-model data/lang_bpe_500/bpe.model \
                                +  --epoch $epoch \
                                +  --avg $avg \
                                +  --jit 1
                                +
                                +
                                +

                                It will generate a file cpu_jit.pt in pruned_transducer_stateless3/exp.

                                +
                                +

                                Caution

                                +

                                Don’t be confused by cpu in cpu_jit.pt. We move all parameters +to CPU before saving it into a pt file; that’s why we use cpu +in the filename.

                                +
                                +
                                +
                                +

                                How to use the exported model

                                +

                                Please refer to the following pages for usage:

                                + +
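Those pages show deployment via sherpa. If you just want to load the scripted model directly in Python, torch.jit.load is enough — a minimal sketch (the encoder call signature and tensor shapes are assumptions; adjust them to the actual module):
+
+import torch
+
+model = torch.jit.load("pruned_transducer_stateless3/exp/cpu_jit.pt")
+model.eval()
+
+# The scripted model bundles the encoder, decoder, and joiner.
+features = torch.zeros(1, 100, 80)   # (batch, num_frames, feature_dim)
+feature_lens = torch.tensor([100])
+encoder_out, encoder_out_lens = model.encoder(features, feature_lens)
+print(encoder_out.shape)
+
The same file can also be loaded from C++ via torch::jit::load, which is why no Python dependency on icefall is needed at deployment time.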
                                +
                                + + +
                                +
                                + +
                                +
                                +
                                +
                                + + + + \ No newline at end of file diff --git a/model-export/export-with-torch-jit-trace.html b/model-export/export-with-torch-jit-trace.html new file mode 100644 index 000000000..f67705231 --- /dev/null +++ b/model-export/export-with-torch-jit-trace.html @@ -0,0 +1,183 @@ + + + + + + + Export model with torch.jit.trace() — icefall 0.1 documentation + + + + + + + + + + + + + + + + + + +
                                + + +
                                + +
                                +
                                +
                                + +
                                +
                                +
                                +
                                + +
                                +

                                Export model with torch.jit.trace()

                                +

                                In this section, we describe how to export a model via +torch.jit.trace().

                                +
                                +

                                When to use it

                                +

                                If we want to use our trained model with torchscript, +we can use torch.jit.trace().

                                +
                                +

                                Hint

                                +

                                See Export model with torch.jit.script() +if you want to use torch.jit.script().

                                +
                                +
                                +
                                +

                                How to export

                                +

                                We use +https://github.com/k2-fsa/icefall/tree/master/egs/librispeech/ASR/lstm_transducer_stateless2 +as an example in the following.

                                +
                                iter=468000
                                +avg=16
                                +
                                +cd egs/librispeech/ASR
                                +
                                +./lstm_transducer_stateless2/export.py \
                                +  --exp-dir ./lstm_transducer_stateless2/exp \
                                +  --bpe-model data/lang_bpe_500/bpe.model \
                                +  --iter $iter \
                                +  --avg  $avg \
                                +  --jit-trace 1
                                +
                                +
                                +

                                It will generate three files inside lstm_transducer_stateless2/exp:

                                +
                                +
                                  +
                                • encoder_jit_trace.pt

                                • +
                                • decoder_jit_trace.pt

                                • +
                                • joiner_jit_trace.pt

                                • +
                                +
                                +

                                You can use +https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/lstm_transducer_stateless2/jit_pretrained.py +to decode sound files with the following commands:

                                +
                                cd egs/librispeech/ASR
                                +./lstm_transducer_stateless2/jit_pretrained.py \
                                +  --bpe-model ./data/lang_bpe_500/bpe.model \
                                +  --encoder-model-filename ./lstm_transducer_stateless2/exp/encoder_jit_trace.pt \
                                +  --decoder-model-filename ./lstm_transducer_stateless2/exp/decoder_jit_trace.pt \
                                +  --joiner-model-filename ./lstm_transducer_stateless2/exp/joiner_jit_trace.pt \
                                +  /path/to/foo.wav \
                                +  /path/to/bar.wav \
                                +  /path/to/baz.wav
                                +
                                +
                                +
                                +
                                +

                                How to use the exported models

                                +

                                Please refer to +https://k2-fsa.github.io/sherpa/python/streaming_asr/lstm/index.html +for its usage in sherpa. +You can also find pretrained models there.
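If you want to drive the three traced files from your own Python code rather than through sherpa, each one loads with torch.jit.load — a short sketch:
+
+import torch
+
+exp = "lstm_transducer_stateless2/exp"
+encoder = torch.jit.load(f"{exp}/encoder_jit_trace.pt")
+decoder = torch.jit.load(f"{exp}/decoder_jit_trace.pt")
+joiner = torch.jit.load(f"{exp}/joiner_jit_trace.pt")
+for m in (encoder, decoder, joiner):
+    m.eval()
+# Each is a self-contained TorchScript module; wire them together with a
+# greedy-search loop, as jit_pretrained.py above does.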

                                +
                                +
                                + + +
                                +
                                + +
                                +
                                +
                                +
                                + + + + \ No newline at end of file diff --git a/model-export/index.html b/model-export/index.html new file mode 100644 index 000000000..eba8286ed --- /dev/null +++ b/model-export/index.html @@ -0,0 +1,148 @@ + + + + + + + Model export — icefall 0.1 documentation + + + + + + + + + + + + + + + + + + + + + + + \ No newline at end of file diff --git a/objects.inv b/objects.inv index 622ff5f6a..fb4d813e5 100644 Binary files a/objects.inv and b/objects.inv differ diff --git a/recipes/aishell/conformer_ctc.html b/recipes/aishell/conformer_ctc.html index 459c14917..2a25508a4 100644 --- a/recipes/aishell/conformer_ctc.html +++ b/recipes/aishell/conformer_ctc.html @@ -42,6 +42,7 @@

                                Contents:

                                • Installation
                                • +
                                • Model export
                                • Recipes
                                  • aishell
                                    • TDNN-LSTM CTC
                                    • diff --git a/recipes/aishell/index.html b/recipes/aishell/index.html index 03ed8f3d8..2efefb985 100644 --- a/recipes/aishell/index.html +++ b/recipes/aishell/index.html @@ -42,6 +42,7 @@

                                      Contents:

                                      • Installation
                                      • +
                                      • Model export
                                      • Recipes
                                        • aishell
                                          • TDNN-LSTM CTC
                                          • diff --git a/recipes/aishell/stateless_transducer.html b/recipes/aishell/stateless_transducer.html index 6c014698d..2a440bb9d 100644 --- a/recipes/aishell/stateless_transducer.html +++ b/recipes/aishell/stateless_transducer.html @@ -42,6 +42,7 @@

                                            Contents:

                                            • Installation
                                            • +
                                            • Model export
                                            • Recipes
                                              • aishell
                                                • TDNN-LSTM CTC
                                                • diff --git a/recipes/aishell/tdnn_lstm_ctc.html b/recipes/aishell/tdnn_lstm_ctc.html index fd852f7cc..ed81f1a9c 100644 --- a/recipes/aishell/tdnn_lstm_ctc.html +++ b/recipes/aishell/tdnn_lstm_ctc.html @@ -42,6 +42,7 @@

                                                  Contents:

                                                  • Installation
                                                  • +
                                                  • Model export
                                                  • Recipes
                                                    • aishell
                                                      • TDNN-LSTM CTC
                                                          diff --git a/recipes/index.html b/recipes/index.html index 37f8b2a47..ce20c5987 100644 --- a/recipes/index.html +++ b/recipes/index.html @@ -21,7 +21,7 @@ - + @@ -42,6 +42,7 @@

                                                          Contents:

                                                          • Installation
                                                          • +
                                                          • Model export
                                                          • Recipes
                                                            • aishell
                                                            • LibriSpeech
                                                            • @@ -114,7 +115,7 @@ Currently, only speech recognition recipes are provided.