icefall/egs/multi_ja_en/ASR/README.md

# Introduction

A bilingual Japanese-English ASR model that utilizes ReazonSpeech, developed by the developers of ReazonSpeech.

**ReazonSpeech** is an open-source dataset that contains a diverse set of natural Japanese speech, collected from terrestrial television streams. It contains more than 35,000 hours of audio.


# Included Training Sets

1. LibriSpeech (English)
2. ReazonSpeech (Japanese)

|Datset| Number of hours| URL|
|---|---:|---|
|**TOTAL**|35,960|---|
|LibriSpeech|960|https://www.openslr.org/12/|
|ReazonSpeech (all) |35,000|https://huggingface.co/datasets/reazon-research/reazonspeech|