# Introduction

The Multi-Domain Cantonese Corpus (MDCC) consists of 73.6 hours of clean read speech paired with
transcripts, collected from Cantonese audiobooks from Hong Kong. It spans the philosophy,
politics, education, culture, lifestyle and family domains, covering a wide range of topics.

The manuscript can be found at: https://arxiv.org/abs/2201.02419