Add more documentation for the yesno recipe.

Fangjun Kuang 2021-08-24 00:00:47 +08:00
parent dcf71b31a5
commit 39554781b2
4 changed files with 112 additions and 6 deletions


@@ -310,7 +310,7 @@ The correct fix is:
 .6.1 tensorboard-plugin-wit-1.8.0 urllib3-1.26.6 werkzeug-2.0.1
 
-Test your Installation
+Test Your Installation
 ----------------------
 
 To test that your installation is successful, let us run

(Binary file not shown: new image ``images/yesno-tdnn-tensorboard-log.png``, 121 KiB, the TensorBoard screenshot referenced below.)


@@ -19,7 +19,7 @@ This page shows you how to run the ``yesno`` recipe.
 Data preparation
 ----------------
 
-.. code-block::
+.. code-block:: bash
 
   $ cd egs/yesno/ASR
   $ ./prepare.sh
@@ -64,17 +64,94 @@ The command to run the training part is:
 .. code-block:: bash
 
   $ cd egs/yesno/ASR
+  $ export CUDA_VISIBLE_DEVICES=""
   $ ./tdnn/train.py
 
 By default, it will run ``15`` epochs. Training logs and checkpoints are saved
 in ``tdnn/exp``.
 
-To see the training options, you can use:
+In ``tdnn/exp``, you will find the following files:
+
+  - ``epoch-0.pt``, ``epoch-1.pt``, ...
+
+    These are checkpoint files, containing model parameters and optimizer ``state_dict``.
+    To resume training from some checkpoint, say ``epoch-10.pt``, you can use:
+
+    .. code-block:: bash
+
+      $ ./tdnn/train.py --start-epoch 11
+
+  - ``tensorboard/``
+
+    This folder contains TensorBoard logs. Training loss, validation loss, learning
+    rate, etc., are recorded in these logs. You can visualize them by running:
+
+    .. code-block:: bash
+
+      $ cd tdnn/exp/tensorboard
+      $ tensorboard dev upload --logdir . --description "TDNN training for yesno with icefall"
+
+    It will print something like the following:
+
+    .. code-block::
+
+      TensorFlow installation not found - running with reduced feature set.
+      Upload started and will continue reading any new data as it's added to the logdir.
+
+      To stop uploading, press Ctrl-C.
+
+      New experiment created. View your TensorBoard at: https://tensorboard.dev/experiment/yKUbhb5wRmOSXYkId1z9eg/
+
+      [2021-08-23T23:49:41] Started scanning logdir.
+      [2021-08-23T23:49:42] Total uploaded: 135 scalars, 0 tensors, 0 binary objects
+      Listening for new data in logdir...
+
+    Note that there is a URL in the above output. Click it and you will see
+    the following screenshot:
+
+    .. figure:: images/yesno-tdnn-tensorboard-log.png
+       :width: 600
+       :alt: TensorBoard screenshot
+       :align: center
+       :target: https://tensorboard.dev/experiment/yKUbhb5wRmOSXYkId1z9eg/
+
+       TensorBoard screenshot.
+
+  - ``log/log-train-xxxx``
+
+    It is the detailed training log in text format, the same as the one
+    printed to the console during training.
+
+To see available training options, you can use:
 
 .. code-block:: bash
 
   $ ./tdnn/train.py --help
 
+.. NOTE::
+
+  By default, ``./tdnn/train.py`` uses GPU 0 for training if GPUs are available.
+  If you have two GPUs, say, GPU 0 and GPU 1, and you want to use GPU 1 for
+  training, you can run:
+
+  .. code-block:: bash
+
+    $ export CUDA_VISIBLE_DEVICES="1"
+    $ ./tdnn/train.py
+
+  Since the ``yesno`` dataset is very small, containing only 30 sound files
+  for training, and the model in use is also very small, we use:
+
+  .. code-block:: bash
+
+    $ export CUDA_VISIBLE_DEVICES=""
+
+  so that ``./tdnn/train.py`` uses CPU during training.
+
+  If you don't have GPUs, then you don't need to
+  run ``export CUDA_VISIBLE_DEVICES=""``.
+
 Decoding
 --------
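
Setting ``CUDA_VISIBLE_DEVICES=""`` hides every GPU from the process, so PyTorch falls back to the CPU. A quick, generic way to verify the behavior described in the note above (a standalone snippet, not part of the recipe):

.. code-block:: python

  # With CUDA_VISIBLE_DEVICES="" exported before launching Python,
  # torch.cuda.is_available() returns False and we fall back to CPU.
  import torch

  device = torch.device("cuda", 0) if torch.cuda.is_available() else torch.device("cpu")
  print(device)  # prints "cpu" when no GPU is visible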
@@ -85,10 +162,12 @@ The command for decoding is:
 .. code-block:: bash
 
+  $ export CUDA_VISIBLE_DEVICES=""
   $ ./tdnn/decode.py
 
 You will see the WER in the output log.
 
-Decoded results are saved in ``tdnn/exp``.
+Decoding results are saved in ``tdnn/exp``.
 
 Colab notebook
 --------------
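
The WER reported in the decoding log is a word-level edit distance between the reference and the hypothesis. A minimal, generic sketch of that computation (an illustration only, not icefall's implementation):

.. code-block:: python

  # Word error rate via Levenshtein distance over words.
  def wer(ref: str, hyp: str) -> float:
      r, h = ref.split(), hyp.split()
      # dp[i][j]: edits needed to turn the first i reference words
      # into the first j hypothesis words
      dp = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
      for i in range(len(r) + 1):
          dp[i][0] = i
      for j in range(len(h) + 1):
          dp[0][j] = j
      for i in range(1, len(r) + 1):
          for j in range(1, len(h) + 1):
              cost = 0 if r[i - 1] == h[j - 1] else 1
              dp[i][j] = min(
                  dp[i - 1][j] + 1,         # deletion
                  dp[i][j - 1] + 1,         # insertion
                  dp[i - 1][j - 1] + cost,  # substitution
              )
      return dp[len(r)][len(h)] / max(len(r), 1)

  print(wer("YES NO YES", "YES YES YES"))  # 0.3333...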


@@ -60,6 +60,16 @@ def get_parser():
         help="Number of epochs to train.",
     )
 
+    parser.add_argument(
+        "--start-epoch",
+        type=int,
+        default=0,
+        help="""Resume training from this epoch.
+        If it is positive, it will load checkpoint from
+        tdnn/exp/epoch-{start_epoch-1}.pt
+        """,
+    )
+
     return parser
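
Per the help text above, a positive ``--start-epoch`` maps to the checkpoint saved at the end of the previous epoch. A small sketch of that mapping (a hypothetical helper, not code from this commit):

.. code-block:: python

  # Map --start-epoch to the checkpoint to load, following the
  # naming convention in the help string (tdnn/exp/epoch-{N}.pt).
  def checkpoint_to_load(start_epoch: int, exp_dir: str = "tdnn/exp"):
      if start_epoch <= 0:
          return None  # train from scratch
      return f"{exp_dir}/epoch-{start_epoch - 1}.pt"

  assert checkpoint_to_load(11) == "tdnn/exp/epoch-10.pt"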
@@ -92,8 +102,6 @@ def get_params() -> AttributeDict:
     - start_epoch: If it is not zero, load checkpoint `start_epoch-1`
       and continue training from that checkpoint.
 
-    - num_epochs: Number of epochs to train.
-
     - best_train_loss: Best training loss so far. It is used to select
       the model that has the lowest training loss. It is
       updated during the training.
@@ -420,6 +428,19 @@ def train_one_epoch(
                 f"batch size: {batch_size}"
             )
 
+            if tb_writer is not None:
+                tb_writer.add_scalar(
+                    "train/current_loss",
+                    loss_cpu / params.train_frames,
+                    params.batch_idx_train,
+                )
+
+                tb_writer.add_scalar(
+                    "train/tot_avg_loss",
+                    tot_avg_loss,
+                    params.batch_idx_train,
+                )
+
         if batch_idx > 0 and batch_idx % params.valid_interval == 0:
             compute_validation_loss(
                 params=params,
@@ -434,6 +455,12 @@ def train_one_epoch(
                 f" best valid loss: {params.best_valid_loss:.4f} "
                 f"best valid epoch: {params.best_valid_epoch}"
             )
 
+            if tb_writer is not None:
+                tb_writer.add_scalar(
+                    "train/valid_loss",
+                    params.valid_loss,
+                    params.batch_idx_train,
+                )
 
     params.train_loss = tot_loss / tot_frames
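
The ``tb_writer`` in the hunks above is presumably a ``torch.utils.tensorboard.SummaryWriter``; this diff does not show its construction. A self-contained sketch of the same logging pattern:

.. code-block:: python

  # Log scalars the same way the training loop above does; the log_dir
  # matches the tdnn/exp/tensorboard folder mentioned in the docs.
  from torch.utils.tensorboard import SummaryWriter

  tb_writer = SummaryWriter(log_dir="tdnn/exp/tensorboard")
  for step, loss in enumerate([0.9, 0.7, 0.5]):
      tb_writer.add_scalar("train/current_loss", loss, step)  # tag, value, global step
  tb_writer.close()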