mirror of https://github.com/k2-fsa/icefall.git synced 2025-12-11 06:55:27 +00:00

disable speed perturbation by default (#1176)

* disable speed perturbation by default

* minor fixes

* minor updates

* updated bash scripts to incorporate with the `speed-perturb` arg

* minor fixes

1. changed the naming scheme from `speed-perturb` to `perturb-speed` to align with the librispeech recipe

>> 00256a7669/egs/librispeech/ASR/local/compute_fbank_librispeech.py (L65)

2. changed arg type for `perturb-speed` to str2bool

2023-08-10 20:56:02 +08:00
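The `perturb-speed` change described in this commit can be illustrated with a minimal sketch. The `str2bool` function below is a local stand-in written for this example, and the help text is hypothetical; only the argument name, its boolean-from-string type, and the disabled-by-default behaviour come from the commit message.

```python
import argparse


def str2bool(value: str) -> bool:
    """Local stand-in for a str2bool helper (an assumption, not the recipe's code)."""
    lowered = value.lower()
    if lowered in ("yes", "true", "t", "y", "1"):
        return True
    if lowered in ("no", "false", "f", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"Boolean value expected, got {value!r}")


def get_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--perturb-speed",
        type=str2bool,
        default=False,  # speed perturbation is disabled unless requested
        help="Hypothetical help text: whether to apply speed perturbation.",
    )
    return parser


if __name__ == "__main__":
    args = get_parser().parse_args()
    print(f"perturb_speed={args.perturb_speed}")
```

With a parser like this, a bash script could pass `--perturb-speed true` explicitly; omitting the flag leaves speed perturbation off.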

Name	Last commit	Date
local	disable speed perturbation by default (#1176)	2023-08-10 20:56:02 +08:00
pruned_transducer_stateless7	Remove cur_batch_idx (#1102)	2023-05-30 14:49:54 +08:00
prepare.sh	disable speed perturbation by default (#1176)	2023-08-10 20:56:02 +08:00
README.md	Add AliMeeting multi-condition training recipe (#751)	2022-12-10 18:15:23 +08:00
RESULTS.md	Add AliMeeting multi-condition training recipe (#751)	2022-12-10 18:15:23 +08:00
shared	Add AliMeeting multi-condition training recipe (#751)	2022-12-10 18:15:23 +08:00

README.md

Introduction

This recipe trains multi-domain ASR models for AliMeeting. By multi-domain, we mean that a single model is trained on both close-talk and far-field conditions. The recipe optionally uses GSS-based enhancement for the far-field array microphones. We pool data from the following four conditions and train a single model on the pooled data:

(i) individual headset microphone (IHM)
(ii) IHM with simulated reverb
(iii) single distant microphone (SDM)
(iv) GSS-enhanced array microphones
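The pooling idea can be sketched in plain Python. This is only an illustration under stated assumptions: the condition labels (`ihm`, `ihm_rvb`, `sdm`, `gss`), the record fields, and the `pool_conditions` function are hypothetical and do not reflect the recipe's actual data pipeline.

```python
from typing import Dict, List

# Hypothetical labels for the four pooled conditions.
CONDITIONS = ("ihm", "ihm_rvb", "sdm", "gss")


def pool_conditions(per_condition: Dict[str, List[dict]]) -> List[dict]:
    """Concatenate utterance records from all conditions into one training pool.

    Each record is tagged with the condition it came from, so that a single
    model sees close-talk and far-field versions of the data together.
    Conditions missing from the input (e.g. GSS, which is optional) are skipped.
    """
    pooled = []
    for name in CONDITIONS:
        for utt in per_condition.get(name, []):
            pooled.append({**utt, "condition": name})
    return pooled


example = {
    "ihm": [{"id": "utt1", "text": "你好"}],
    "sdm": [{"id": "utt1", "text": "你好"}],
}
pooled = pool_conditions(example)
```

Tagging each record with its condition keeps the pooled set traceable, which is useful when results are later reported per condition.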

This differs from alimeeting/ASR, which trains a model only on far-field audio. Additionally, we apply text normalization similar to that of the original M2MeT challenge, so the results should be more comparable to those in Table 4 of the paper.

The following additional packages need to be installed to run this recipe:

pip install jieba
pip install paddlepaddle
pip install git+https://github.com/desh2608/gss.git
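Before running the recipe, it may help to confirm that these packages are importable. The sketch below assumes the import names `jieba`, `paddle`, and `gss` for the three packages above; the `missing_packages` helper is hypothetical and not part of the recipe.

```python
import importlib.util
from typing import Dict, List

# Maps the pip package name to its assumed top-level import name.
REQUIRED = {"jieba": "jieba", "paddlepaddle": "paddle", "gss": "gss"}


def missing_packages(required: Dict[str, str] = REQUIRED) -> List[str]:
    """Return the pip names of packages whose module cannot be found."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All additional packages are installed.")
```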

./RESULTS.md contains the latest results.

Performance Record

pruned_transducer_stateless7

The following are decoded using modified_beam_search:

Evaluation condition	eval WER (%)	test WER (%)
IHM	9.58	11.53
SDM	23.37	25.85
MDM (GSS-enhanced)	11.82	14.22

See RESULTS.md for details.
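For readers unfamiliar with the metric, an error rate of this kind is the edit distance (substitutions, deletions, and insertions) between reference and hypothesis token sequences, divided by the number of reference tokens. The sketch below is a generic illustration, not the recipe's scoring code; how the recipe tokenizes and normalizes text before scoring is not stated here, so the functions accept any token sequences.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(
                min(
                    prev[j] + 1,             # deletion
                    cur[j - 1] + 1,          # insertion
                    prev[j - 1] + (r != h),  # substitution or match
                )
            )
        prev = cur
    return prev[-1]


def error_rate(refs: List[Sequence], hyps: List[Sequence]) -> float:
    """Total edit errors over total reference tokens, as a percentage."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return 100.0 * errors / total
```

For example, a four-token reference with one substituted token gives `error_rate([list("abcd")], [list("abcf")])` equal to 25.0.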