Fangjun Kuang 4890e27b45
Extract framewise alignment information using CTC decoding (#39)
* Use new APIs with k2.RaggedTensor

* Fix style issues.

* Update the installation doc, saying it requires at least k2 v1.7

* Extract framewise alignment information using CTC decoding.

* Print environment information.

Print information about k2, lhotse, PyTorch, and icefall.

* Fix CI.

* Fix CI.

* Compute framewise alignment information of the LibriSpeech dataset.

* Update comments for the time to compute alignments of train-960.

* Preserve cut id in mix cut transformer.

* Minor fixes.

* Add doc about how to extract framewise alignments.
2021-10-18 14:24:33 +08:00

1.5 KiB

Introduction

Please visit https://icefall.readthedocs.io/en/latest/recipes/librispeech/conformer_ctc.html for how to run this recipe.

How to compute framewise alignment information

Step 1: Train a model

Please use conformer_ctc/train.py to train a model. See https://icefall.readthedocs.io/en/latest/recipes/librispeech/conformer_ctc.html for how to do it.

Step 2: Compute framewise alignment

Run

# Choose a checkpoint and determine the number of checkpoints to average
epoch=30
avg=15
./conformer_ctc/ali.py \
  --epoch $epoch \
  --avg $avg \
  --max-duration 500 \
  --bucketing-sampler 0 \
  --full-libri 1 \
  --exp-dir conformer_ctc/exp \
  --lang-dir data/lang_bpe_5000 \
  --ali-dir data/ali_5000

and you will get four files inside the folder data/ali_5000:

$ ls -lh data/ali_500
total 546M
-rw-r--r-- 1 kuangfangjun root 1.1M Sep 28 08:06 test_clean.pt
-rw-r--r-- 1 kuangfangjun root 1.1M Sep 28 08:07 test_other.pt
-rw-r--r-- 1 kuangfangjun root 542M Sep 28 11:36 train-960.pt
-rw-r--r-- 1 kuangfangjun root 2.1M Sep 28 11:38 valid.pt

Note: It can take more than 3 hours to compute the alignment for the training dataset, which contains 960 * 3 = 2880 hours of data.

Caution: The model parameters in conformer_ctc/ali.py have to match those in conformer_ctc/train.py.

Caution: You have to set the parameter preserve_id to True for CutMix. Search ./conformer_ctc/asr_datamodule.py for preserve_id.

TODO: Add doc about how to use the extracted alignment in the other pull-request.