* Use new APIs with k2.RaggedTensor * Fix style issues. * Update the installation doc, saying it requires at least k2 v1.7 * Extract framewise alignment information using CTC decoding. * Print environment information. Print information about k2, lhotse, PyTorch, and icefall. * Fix CI. * Fix CI. * Compute framewise alignment information of the LibriSpeech dataset. * Update comments for the time to compute alignments of train-960. * Preserve cut id in mix cut transformer. * Minor fixes. * Add doc about how to extract framewise alignments.
1.5 KiB
Introduction
Please visit https://icefall.readthedocs.io/en/latest/recipes/librispeech/conformer_ctc.html for how to run this recipe.
How to compute framewise alignment information
Step 1: Train a model
Please use conformer_ctc/train.py
to train a model.
See https://icefall.readthedocs.io/en/latest/recipes/librispeech/conformer_ctc.html
for how to do it.
Step 2: Compute framewise alignment
Run
# Choose a checkpoint and determine the number of checkpoints to average
epoch=30
avg=15
./conformer_ctc/ali.py \
--epoch $epoch \
--avg $avg \
--max-duration 500 \
--bucketing-sampler 0 \
--full-libri 1 \
--exp-dir conformer_ctc/exp \
--lang-dir data/lang_bpe_5000 \
--ali-dir data/ali_5000
and you will get four files inside the folder data/ali_5000
:
$ ls -lh data/ali_500
total 546M
-rw-r--r-- 1 kuangfangjun root 1.1M Sep 28 08:06 test_clean.pt
-rw-r--r-- 1 kuangfangjun root 1.1M Sep 28 08:07 test_other.pt
-rw-r--r-- 1 kuangfangjun root 542M Sep 28 11:36 train-960.pt
-rw-r--r-- 1 kuangfangjun root 2.1M Sep 28 11:38 valid.pt
Note: It can take more than 3 hours to compute the alignment for the training dataset, which contains 960 * 3 = 2880 hours of data.
Caution: The model parameters in conformer_ctc/ali.py
have to match those
in conformer_ctc/train.py
.
Caution: You have to set the parameter preserve_id
to True
for CutMix
.
Search ./conformer_ctc/asr_datamodule.py
for preserve_id
.
TODO: Add doc about how to use the extracted alignment in the other pull-request.