mirror of https://github.com/k2-fsa/icefall.git synced 2025-08-08 09:32:20 +00:00


Whisper large fine-tuning on wenetspeech, multi-hans-zh (#1483)

* add whisper fbank for wenetspeech

* add whisper fbank for other dataset

* add str to bool

* add decode for wenetspeech

* add requirements.txt

* add original model decode with 30s

* test feature extractor speed

* add aishell2 feat

* change compute feature batch

* fix overwrite

* fix executor

* regression

* add kaldifeat whisper fbank

* fix io issue

* parallel jobs

* use multi machines

* add wenetspeech fine-tune scripts

* add monkey patch codes

* remove useless file

* fix subsampling factor

* fix too long audios

* add remove long short

* fix whisper version to support multi batch beam

* decode all wav files

* remove utterance more than 30s in test_net

* only test net

* using soft links

* add kespeech whisper feats

* fix index error

* add manifests for whisper

* change to LilcomChunkyWriter

* add missing option

* decrease cpu usage 

* add speed perturb for kespeech

* fix kespeech speed perturb

* add dataset

* load checkpoint from specific path

* add speechio

* add speechio results

---------

Co-authored-by: zr_jin <peter.jin.cn@gmail.com>

2024-03-07 19:04:27 +08:00

| Name | Last commit | Date |
|---|---|---|
| local | Whisper large fine-tuning on wenetspeech, multi-hans-zh (#1483) | 2024-03-07 19:04:27 +08:00 |
| pruned_transducer_stateless2 | fixed a CI test for wenetspeech (#1476) | 2024-01-27 06:41:56 +08:00 |
| pruned_transducer_stateless5 | Strengthened style constraints (#1527) | 2024-03-04 23:28:04 +08:00 |
| whisper | Whisper large fine-tuning on wenetspeech, multi-hans-zh (#1483) | 2024-03-07 19:04:27 +08:00 |
| zipformer | streaming_decode.py, relax the audio range from [-1,+1] to [-10,+10] (#1448) | 2024-01-05 10:21:27 +08:00 |
| finetune.sh | Add finetuning script for aishell (#974) | 2023-03-30 17:08:46 +08:00 |
| prepare.sh | Whisper large fine-tuning on wenetspeech, multi-hans-zh (#1483) | 2024-03-07 19:04:27 +08:00 |
| README.md | [WIP] Pruned-transducer-stateless5-for-WenetSpeech (offline and streaming) (#447) | 2022-07-28 12:54:27 +08:00 |
| RESULTS.md | zipformer wenetspeech (#1130) | 2023-06-26 09:33:18 +08:00 |
| shared | [Ready to merge] Pruned Transducer Stateless2 for WenetSpeech (char-based) (#349) | 2022-05-23 17:13:01 +08:00 |

README.md

Introduction

This recipe includes several ASR models trained on WenetSpeech.

./RESULTS.md contains the latest results.

Transducers

Several folders in this directory have `transducer` in their names. The following table lists the differences among them.

|  | Encoder | Decoder | Comment |
|---|---|---|---|
| `pruned_transducer_stateless2` | Conformer(modified) | Embedding + Conv1d | Using k2 pruned RNN-T loss |
| `pruned_transducer_stateless5` | Conformer(modified) | Embedding + Conv1d | Using k2 pruned RNN-T loss |

The decoder in `transducer_stateless` is modified from the paper "RNN-Transducer with Stateless Prediction Network". We place an additional Conv1d layer right after the input embedding layer.
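
The "Embedding + Conv1d" decoder can be illustrated with a small dependency-free sketch. It is not the recipe's code: the function names `make_decoder` and `stateless_decoder`, the random weights, the per-channel (depthwise) kernel, and the zero left-padding are all assumptions made for illustration. The point it shows is that the output at each position depends only on the last `context_size` tokens, which is why the decoder needs no recurrent state.

```python
import random


def make_decoder(vocab_size, dim, context_size, seed=0):
    """Build random toy weights (hypothetical helper, not the recipe's API)."""
    rng = random.Random(seed)
    embedding = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                 for _ in range(vocab_size)]
    # kernel[c][k] weights channel c of the k-th token in the context window.
    kernel = [[rng.uniform(-1.0, 1.0) for _ in range(context_size)]
              for _ in range(dim)]
    return embedding, kernel


def stateless_decoder(tokens, embedding, kernel):
    """Embed the tokens, then apply a causal per-channel 1-D convolution."""
    dim = len(kernel)
    context_size = len(kernel[0])
    # Left-pad with zero vectors so position u only sees tokens up to u
    # (the padding scheme is an assumption of this sketch).
    padded = [[0.0] * dim for _ in range(context_size - 1)]
    padded += [embedding[t] for t in tokens]
    outputs = []
    for u in range(len(tokens)):
        window = padded[u:u + context_size]
        outputs.append([
            sum(kernel[c][k] * window[k][c] for k in range(context_size))
            for c in range(dim)
        ])
    return outputs


embedding, kernel = make_decoder(vocab_size=10, dim=4, context_size=2)
a = stateless_decoder([1, 2, 3, 4], embedding, kernel)
b = stateless_decoder([5, 9, 3, 4], embedding, kernel)
# Both sequences end in tokens 3, 4, so their final outputs are compared here.
print(a[-1] == b[-1])
```

With `context_size=2`, the two sequences share their last two tokens, so the final decoder outputs are identical even though the earlier history differs. An LSTM prediction network would generally give different outputs in that case.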