This repository has been archived on 2026-03-23. You can view files and clone it, but cannot push or open issues or pull requests.

Introduction

A bilingual Japanese-English automatic speech recognition (ASR) model trained on ReazonSpeech, built by the ReazonSpeech developers.

ReazonSpeech is an open-source dataset that contains a diverse set of natural Japanese speech, collected from terrestrial television streams. It contains more than 35,000 hours of audio.

Included Training Sets

  1. LibriSpeech (English)
  2. ReazonSpeech (Japanese)
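
| Dataset            | Number of hours | URL                                                           |
|--------------------|-----------------|---------------------------------------------------------------|
| TOTAL              | 35,960          | ---                                                           |
| LibriSpeech        | 960             | https://www.openslr.org/12/                                   |
| ReazonSpeech (all) | 35,000          | https://huggingface.co/datasets/reazon-research/reazonspeech  |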
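
The hour counts in the table above imply a heavily Japanese-weighted training mix. As a rough illustration, the short sketch below recomputes the total and each dataset's share of it. The `HOURS` dictionary and the `summarize` helper are hypothetical names introduced only for this example; the figures are copied from the table and say nothing about how the data was actually sampled during training.

```python
# Hours per training set, copied from the table above.
# The dictionary and function names are illustrative, not part of any library.
HOURS = {
    "LibriSpeech": 960,             # English
    "ReazonSpeech (all)": 35_000,   # Japanese
}


def summarize(hours):
    """Return the total hours and each dataset's share of that total."""
    total = sum(hours.values())
    shares = {name: h / total for name, h in hours.items()}
    return total, shares


total, shares = summarize(HOURS)
print(f"TOTAL: {total:,} hours")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

By raw hours, English accounts for under 3% of the combined audio, which may be worth keeping in mind when comparing the model's behavior across the two languages.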