icefall/egs/voxpopuli/ASR/README.md
Karel Vesely 4ec48f30b1 add the voxpopuli recipe
- this is the data preparation
- there is no ASR training and no results
2023-11-07 15:03:23 +01:00

Readme

This recipe contains data preparation for the VoxPopuli dataset. It does not yet include model training.

audio per language

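| Language | Size | Untranscribed (hours) | Transcribed (hours) |
|----------|------|-----------------------|---------------------|
| bg | 295G | 17.6K | - |
| cs | 308G | 18.7K | 62 |
| da | 233G | 13.6K | - |
| de | 379G | 23.2K | 282 |
| el | 305G | 17.7K | - |
| en | 382G | 24.1K | 543 |
| es | 362G | 21.4K | 166 |
| et | 179G | 10.6K | 3 |
| fi | 236G | 14.2K | 27 |
| fr | 376G | 22.8K | 211 |
| hr | 132G | 8.1K | 43 |
| hu | 297G | 17.7K | 63 |
| it | 361G | 21.9K | 91 |
| lt | 243G | 14.4K | 2 |
| lv | 217G | 13.1K | - |
| mt | 147G | 9.1K | - |
| nl | 322G | 19.0K | 53 |
| pl | 348G | 21.2K | 111 |
| pt | 300G | 17.5K | - |
| ro | 296G | 17.9K | 89 |
| sk | 201G | 12.1K | 35 |
| sl | 190G | 11.3K | 10 |
| sv | 272G | 16.3K | - |
| total | 6.3T | 384K | 1791 |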
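
As a quick sanity check on the table above, the following minimal sketch sums the transcribed hours per language and compares the result with the `total` row. The dictionary values are copied from the table; the helper names (`total_transcribed`, `share_of_total`, `ranked_by_hours`) are illustrative only and are not part of the recipe.

```python
# Transcribed hours per language, copied from the table above.
# Languages listed with "-" (no transcribed audio) are omitted.
transcribed_hours = {
    "cs": 62, "de": 282, "en": 543, "es": 166,
    "et": 3, "fi": 27, "fr": 211, "hr": 43,
    "hu": 63, "it": 91, "lt": 2, "nl": 53,
    "pl": 111, "ro": 89, "sk": 35, "sl": 10,
}


def total_transcribed(hours):
    """Return the sum of transcribed hours over all languages."""
    return sum(hours.values())


def share_of_total(hours, lang):
    """Return the fraction of all transcribed hours contributed by `lang`."""
    return hours[lang] / total_transcribed(hours)


def ranked_by_hours(hours):
    """Return language codes sorted from most to least transcribed audio."""
    return sorted(hours, key=hours.get, reverse=True)


if __name__ == "__main__":
    print("languages with transcripts:", len(transcribed_hours))
    print("total transcribed hours:", total_transcribed(transcribed_hours))
    print("largest subsets:", ranked_by_hours(transcribed_hours)[:3])
    print(f"share of English: {share_of_total(transcribed_hours, 'en'):.1%}")
```

The computed sum agrees with the 1791 hours in the `total` row. The ranking also shows how unevenly the transcribed data is distributed: English, German, and French together account for more than half of it.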