yesno
=====

This page shows you how to run the ``yesno`` recipe.

.. HINT::

  We assume you have read the page :ref:`install icefall` and have set up
  the environment for ``icefall``.

.. HINT::

  You **don't** need a **GPU** to run this recipe. It can be run on a **CPU**.
  Training takes less than 30 **seconds** and you will get
  the following WER::

    [test_set] %WER 0.42% [1 / 240, 0 ins, 1 del, 0 sub ]

Data preparation
----------------

.. code-block:: bash

  $ cd egs/yesno/ASR
  $ ./prepare.sh

The script ``./prepare.sh`` handles the data preparation for you, automagically.
All you need to do is to run it.

The data preparation contains several stages. You can use the following two
options:

- ``--stage``
- ``--stop-stage``

to control which stage(s) should be run. By default, all stages are executed.

For example,

.. code-block:: bash

  $ cd egs/yesno/ASR
  $ ./prepare.sh --stage 0 --stop-stage 0

runs only stage 0.

To run stage 2 to stage 5, use:

.. code-block:: bash

  $ ./prepare.sh --stage 2 --stop-stage 5

Training
--------

We provide only a TDNN model, contained in the ``tdnn`` folder, for ``yesno``.

The command to run the training part is:

.. code-block:: bash

  $ cd egs/yesno/ASR
  $ export CUDA_VISIBLE_DEVICES=""
  $ ./tdnn/train.py

By default, it will run ``15`` epochs. Training logs and checkpoints are saved
in ``tdnn/exp``.

In ``tdnn/exp``, you will find the following files:

- ``epoch-0.pt``, ``epoch-1.pt``, ...

  These are checkpoint files, containing model parameters and optimizer
  ``state_dict``. To resume training from some checkpoint, say
  ``epoch-10.pt``, you can use:

  .. code-block:: bash

    $ ./tdnn/train.py --start-epoch 11

- ``tensorboard/``

  This folder contains TensorBoard logs. Training loss, validation loss,
  learning rate, etc., are recorded in these logs. You can visualize them by:

  .. code-block:: bash

    $ cd tdnn/exp/tensorboard
    $ tensorboard dev upload --logdir . --description "TDNN training for yesno with icefall"

  It will print something like below:

  .. code-block::

    TensorFlow installation not found - running with reduced feature set.
    Upload started and will continue reading any new data as it's added to the logdir.

    To stop uploading, press Ctrl-C.

    New experiment created. View your TensorBoard at: https://tensorboard.dev/experiment/yKUbhb5wRmOSXYkId1z9eg/

    [2021-08-23T23:49:41] Started scanning logdir.
    [2021-08-23T23:49:42] Total uploaded: 135 scalars, 0 tensors, 0 binary objects
    Listening for new data in logdir...

  Note the URL in the above output. Click it and you will see
  the following screenshot:

  .. figure:: images/yesno-tdnn-tensorboard-log.png
     :width: 600
     :alt: TensorBoard screenshot
     :align: center
     :target: https://tensorboard.dev/experiment/yKUbhb5wRmOSXYkId1z9eg/

     TensorBoard screenshot.

- ``log/log-train-xxxx``

  This is the detailed training log in text format, the same as the one
  you saw printed to the console during training.

To see available training options, you can use:

.. code-block:: bash

  $ ./tdnn/train.py --help

.. NOTE::

  By default, ``./tdnn/train.py`` uses GPU 0 for training if GPUs are available.
  If you have two GPUs, say, GPU 0 and GPU 1, and you want to use GPU 1 for
  training, you can run:

  .. code-block:: bash

    $ export CUDA_VISIBLE_DEVICES="1"
    $ ./tdnn/train.py

  Since the ``yesno`` dataset is very small, containing only 30 sound files
  for training, and the model in use is also very small, we use:

  .. code-block:: bash

    $ export CUDA_VISIBLE_DEVICES=""

  so that ``./tdnn/train.py`` uses CPU during training.

  If you don't have GPUs, then you don't need to
  run ``export CUDA_VISIBLE_DEVICES=""``.

Decoding
--------

The decoding part uses checkpoints saved by the training part, so you have
to run the training part first.

The command for decoding is:

.. code-block:: bash

  $ export CUDA_VISIBLE_DEVICES=""
  $ ./tdnn/decode.py

You will see the WER in the output log.

Decoded results are saved in ``tdnn/exp``.

Colab notebook
--------------

We provide a Colab notebook for this recipe.

|yesno colab notebook|

.. |yesno colab notebook| image:: https://colab.research.google.com/assets/colab-badge.svg
   :target: https://colab.research.google.com/drive/1tIjjzaJc3IvGyKiMCDWO-TSnBgkcuN3B?usp=sharing

Use a pre-trained model
-----------------------

TODO