mirror of
https://github.com/k2-fsa/icefall.git
synced 2025-09-11 10:04:21 +00:00
11 lines
571 B
Markdown
11 lines
571 B
Markdown
# Introduction
|
|
|
|
This is a weakly supervised ASR recipe for the LibriSpeech (clean 100 hours) dataset. We training a
|
|
conformer model using BTC/OTC with transcripts with synthetic errors. In this README, we will describe
|
|
the task and the BTC/OTC training process.
|
|
|
|
## Task
|
|
We propose BTC/OTC to directly train an ASR system leveraging weak supervision, i.e., speech with non-verbatim transcripts.
|
|
This is achieved by using a special token to model uncertainties (i.e., substitution errors, insertion errors, and deletion errors)
|
|
within the WFST framework during training.
|