Mirror of https://github.com/k2-fsa/icefall.git (synced 2025-08-09 01:52:41 +00:00)
Small fixes to the transducer training doc (#575)

This commit is contained in:
parent 099cd3a215
commit 9ae2f3a3c5
@@ -1,5 +1,5 @@
-Transducer
-==========
+LSTM Transducer
+===============
 
 .. hint::
 
@@ -7,7 +7,7 @@ Transducer
    for pretrained models if you don't want to train a model from scratch.
 
 
-This tutorial shows you how to train a transducer model
+This tutorial shows you how to train an LSTM transducer model
 with the `LibriSpeech <https://www.openslr.org/12>`_ dataset.
 
 We use pruned RNN-T to compute the loss.
@@ -20,9 +20,9 @@ We use pruned RNN-T to compute the loss.
 
 The transducer model consists of 3 parts:
 
-- Encoder, a.k.a, transcriber. We use an LSTM model
-- Decoder, a.k.a, predictor. We use a model consisting of ``nn.Embedding``
-  and ``nn.Conv1d``
+- Encoder, a.k.a, the transcription network. We use an LSTM model
+- Decoder, a.k.a, the prediction network. We use a stateless model consisting of
+  ``nn.Embedding`` and ``nn.Conv1d``
 - Joiner, a.k.a, the joint network.
 
 .. caution::
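The stateless decoder described in the list above can be sketched in PyTorch roughly as follows. This is a minimal illustration with hypothetical names, not icefall's actual implementation:

.. code-block:: python

   import torch
   import torch.nn as nn

   class StatelessDecoder(nn.Module):
       """Prediction network: an embedding followed by a 1-D convolution over
       a fixed, small left-context of previous symbols. No recurrent state is
       kept across calls, which is why it is called "stateless"."""

       def __init__(self, vocab_size: int, embed_dim: int, context_size: int = 2):
           super().__init__()
           self.embedding = nn.Embedding(vocab_size, embed_dim)
           self.conv = nn.Conv1d(embed_dim, embed_dim, kernel_size=context_size)

       def forward(self, y: torch.Tensor) -> torch.Tensor:
           # y: (batch, num_symbols) token IDs of previously emitted symbols
           embed = self.embedding(y).permute(0, 2, 1)  # (batch, embed_dim, num_symbols)
           out = self.conv(embed)  # output length shrinks by context_size - 1
           return out.permute(0, 2, 1)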
@@ -74,7 +74,11 @@ Data preparation
 The script ``./prepare.sh`` handles the data preparation for you, **automagically**.
 All you need to do is to run it.
 
-The data preparation contains several stages, you can use the following two
+.. note::
+
+   We encourage you to read ``./prepare.sh``.
+
+The data preparation contains several stages. You can use the following two
 options:
 
 - ``--stage``
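For example, to run only a subset of the data-preparation stages, an invocation like the one below can be used. This is a sketch assuming icefall's usual ``--stage``/``--stop-stage`` convention; the stage numbers are illustrative:

.. code-block:: bash

   cd egs/librispeech/ASR
   # Run stages 0 through 3 of the data preparation and then stop.
   ./prepare.sh --stage 0 --stop-stage 3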
@@ -263,7 +267,7 @@ You will find the following files in that directory:
 
 - ``tensorboard/``
 
-  This folder contains TensorBoard logs. Training loss, validation loss, learning
+  This folder contains tensorBoard logs. Training loss, validation loss, learning
   rate, etc, are recorded in these logs. You can visualize them by:
 
   .. code-block:: bash
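Uploading the logs prints a URL where they can be viewed; the output quoted in the next hunk comes from such an upload. A typical command looks like the following sketch (the description text is illustrative):

.. code-block:: bash

   cd lstm_transducer_stateless2/exp/tensorboard
   # Upload the logs; the command prints a URL for viewing them in a browser.
   tensorboard dev upload --logdir . --description "LSTM transducer training for LibriSpeech"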
@@ -287,7 +291,7 @@ You will find the following files in that directory:
     [2022-09-20T15:53:02] Total uploaded: 210171 scalars, 0 tensors, 0 binary objects
     Listening for new data in logdir...
 
-  Note there is a URL in the above output, click it and you will see
+  Note there is a URL in the above output. Click it and you will see
   the following screenshot:
 
   .. figure:: images/librispeech-lstm-transducer-tensorboard-log.png
@@ -422,7 +426,7 @@ The following shows two examples:
 Export models
 -------------
 
-`lstm_transducer_stateless2/export.py <https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/lstm_transducer_stateless2/export.py>`_ supports to export checkpoints from ``lstm_transducer_stateless2/exp`` in the following ways.
+`lstm_transducer_stateless2/export.py <https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/lstm_transducer_stateless2/export.py>`_ supports exporting checkpoints from ``lstm_transducer_stateless2/exp`` in the following ways.
 
 Export ``model.state_dict()``
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
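A ``state_dict`` export typically looks like the following. This is a sketch: the flag names are assumed from icefall's export-script conventions and the epoch/avg values are illustrative, so check ``./lstm_transducer_stateless2/export.py --help`` for the authoritative options:

.. code-block:: bash

   ./lstm_transducer_stateless2/export.py \
     --exp-dir ./lstm_transducer_stateless2/exp \
     --bpe-model data/lang_bpe_500/bpe.model \
     --epoch 35 \
     --avg 10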
@@ -458,7 +462,7 @@ It will generate a file ``./lstm_transducer_stateless2/exp/pretrained.pt``.
    cd lstm_transducer_stateless2/exp
    ln -s pretrained epoch-9999.pt
 
-And then pass `--epoch 9999 --avg 1 --use-averaged-model 0` to
+And then pass ``--epoch 9999 --avg 1 --use-averaged-model 0`` to
 ``./lstm_transducer_stateless2/decode.py``.
 
 To use the exported model with ``./lstm_transducer_stateless2/pretrained.py``, you
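With the symlink in place, decoding with the exported model looks roughly like this (a sketch; ``--exp-dir`` follows icefall's usual conventions and is not part of the quoted doc):

.. code-block:: bash

   ./lstm_transducer_stateless2/decode.py \
     --epoch 9999 \
     --avg 1 \
     --use-averaged-model 0 \
     --exp-dir ./lstm_transducer_stateless2/exp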
@@ -506,6 +510,11 @@ To use the generated files with ``./lstm_transducer_stateless2/jit_pretrained``:
     /path/to/foo.wav \
     /path/to/bar.wav
 
+.. hint::
+
+   Please see `<https://k2-fsa.github.io/sherpa/python/streaming_asr/lstm/english/server.html>`_
+   for how to use the exported models in ``sherpa``.
+
 Export model for ncnn
 ~~~~~~~~~~~~~~~~~~~~~
 
||||||
@ -576,7 +585,7 @@ It will generate the following files:
|
|||||||
- ``./lstm_transducer_stateless2/exp/joiner_jit_trace-pnnx.ncnn.param``
|
- ``./lstm_transducer_stateless2/exp/joiner_jit_trace-pnnx.ncnn.param``
|
||||||
- ``./lstm_transducer_stateless2/exp/joiner_jit_trace-pnnx.ncnn.bin``
|
- ``./lstm_transducer_stateless2/exp/joiner_jit_trace-pnnx.ncnn.bin``
|
||||||
|
|
||||||
To use the above generate files, run:
|
To use the above generated files, run:
|
||||||
|
|
||||||
.. code-block:: bash
|
.. code-block:: bash
|
||||||
|
|
||||||
@ -605,8 +614,8 @@ To use the above generate files, run:
|
|||||||
To use the above generated files in C++, please see
|
To use the above generated files in C++, please see
|
||||||
`<https://github.com/k2-fsa/sherpa-ncnn>`_
|
`<https://github.com/k2-fsa/sherpa-ncnn>`_
|
||||||
|
|
||||||
It is able to generate a static linked library that can be run on Linux, Windows,
|
It is able to generate a static linked executable that can be run on Linux, Windows,
|
||||||
macOS, Raspberry Pi, etc.
|
macOS, Raspberry Pi, etc, without external dependencies.
|
||||||
|
|
||||||
Download pretrained models
|
Download pretrained models
|
||||||
--------------------------
|
--------------------------
|
||||||
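Building ``sherpa-ncnn`` is a standard CMake build. A sketch assuming a typical toolchain (see the repository's README for authoritative and cross-compilation instructions):

.. code-block:: bash

   git clone https://github.com/k2-fsa/sherpa-ncnn
   cd sherpa-ncnn
   mkdir build && cd build
   # A Release build produces the statically linked decoding executables.
   cmake -DCMAKE_BUILD_TYPE=Release ..
   make -j4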