27 Commits

Author SHA1 Message Date
Yuekai Zhang
6d7c1d13a5
update speechio whisper ft results (#1605)
* update speechio whisper ft results
2024-04-30 11:49:20 +08:00
Yuekai Zhang
5df24c1685
Whisper large fine-tuning on wenetspeech, multi-hans-zh (#1483)
* add whisper fbank for wenetspeech

* add whisper fbank for other dataset

* add str to bool

* add decode for wenetspeech

* add requirements.txt

* add original model decode with 30s

* test feature extractor speed

* add aishell2 feat

* change compute feature batch

* fix overwrite

* fix executor

* regression

* add kaldifeatwhisper fbank

* fix io issue

* parallel jobs

* use multi machines

* add wenetspeech fine-tune scripts

* add monkey patch codes

* remove useless file

* fix subsampling factor

* fix too long audios

* add remove long short

* fix whisper version to support multi batch beam

* decode all wav files

* remove utterances longer than 30s in test_net

* only test net

* using soft links

* add kespeech whisper feats

* fix index error

* add manifests for whisper

* change to LilcomChunkyWriter

* add missing option

* decrease cpu usage

* add speed perturb for kespeech

* fix kespeech speed perturb

* add dataset

* load checkpoint from specific path

* add speechio

* add speechio results

---------

Co-authored-by: zr_jin <peter.jin.cn@gmail.com>
2024-03-07 19:04:27 +08:00
zr_jin
242002e0bd
Strengthened style constraints (#1527) 2024-03-04 23:28:04 +08:00
Wei Kang
aac7df064a
Recipes for open vocabulary keyword spotting (#1428)
* English recipe on gigaspeech; Chinese recipe on wenetspeech
2024-02-22 15:31:20 +08:00
zr_jin
9c494a3329
typos fixed (#1472) 2024-01-25 18:41:43 +08:00
zr_jin
9ef8145fa3
minor fixes (#1240) 2023-09-04 17:56:05 +08:00
zr_jin
74806b744b
disable speed perturbation by default (#1176)
* disable speed perturbation by default

* minor fixes

* minor updates

* updated bash scripts to work with the `speed-perturb` arg

* minor fixes

1. changed the naming scheme from `speed-perturb` to `perturb-speed` to align with the librispeech recipe

>> 00256a7669/egs/librispeech/ASR/local/compute_fbank_librispeech.py (L65)

2. changed arg type for `perturb-speed` to str2bool
2023-08-10 20:56:02 +08:00
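
The `perturb-speed` change described in the commit above (a string-to-bool argument, disabled by default) can be sketched as follows. This is a minimal illustration, not the repository's code: the `str2bool` helper and `get_parser` function here are stand-ins written for this example, and the help text is an assumption.

```python
import argparse


def str2bool(value: str) -> bool:
    """Parse a command-line string such as "true" or "0" into a bool.

    Stand-in helper for illustration; the recipe may import its own.
    """
    if isinstance(value, bool):
        return value
    lowered = value.lower()
    if lowered in ("yes", "true", "t", "y", "1"):
        return True
    if lowered in ("no", "false", "f", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"Boolean value expected, got {value!r}")


def get_parser() -> argparse.ArgumentParser:
    """Build a parser with a `--perturb-speed` flag that is off by default."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--perturb-speed",
        type=str2bool,
        default=False,  # speed perturbation disabled unless requested
        help="Whether to apply speed perturbation (hypothetical help text).",
    )
    return parser


# Omitting the flag keeps perturbation off; passing "true" turns it on.
default_args = get_parser().parse_args([])
enabled_args = get_parser().parse_args(["--perturb-speed", "true"])
```

Using a converter passed as `type=` rather than `action="store_true"` lets a shell script forward an explicit value (for example `--perturb-speed false`), which matches the commit's note about updating the bash scripts.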
marcoyang1998
5ed6fc0e6d
add sym link (#1170) 2023-07-12 15:37:14 +08:00
marcoyang1998
d337398d29
Shallow fusion for Aishell (#954)
* add shallow fusion and LODR for aishell

* update RESULTS

* add save by iterations
2023-04-03 16:20:29 +08:00
Wei Kang
d74822d07b
Fix wenetspeech decoding speed (#953) 2023-03-21 21:35:32 +08:00
Fangjun Kuang
bd7fa2253d
Update the manifest statistics of the L subset of wenetspeech (#731) 2022-12-04 20:27:45 +08:00
Desh Raj
107df3b115
apply black on all files 2022-11-17 09:42:17 -05:00
Fangjun Kuang
60317120ca
Revert "Apply new Black style changes" 2022-11-17 20:19:32 +08:00
Desh Raj
d110b04ad3
apply new black formatting to all files 2022-11-16 13:06:43 -05:00
Fangjun Kuang
e18fa78c3a
Check that read_manifests_if_cached returns a non-empty dict. (#555) 2022-08-28 11:50:11 +08:00
Fangjun Kuang
d68b8e9120
Disable CUDA_LAUNCH_BLOCKING in wenetspeech recipes. (#554)
* Disable CUDA_LAUNCH_BLOCKING in wenetspeech recipes.

* minor fixes
2022-08-28 11:17:38 +08:00
Weiji Zhuang
36eacaccb2
Fix preparing char based lang and add multiprocessing for wenetspeech text segmentation (#513)
* add multiprocessing for wenetspeech text segmentation

* Fix preparing char based lang for wenetspeech

* fix style

Co-authored-by: WeijiZhuang <zhuangweiji@xiaomi.com>
2022-08-03 19:19:40 +08:00
Mingshuang Luo
1b478d3ac3
Add other decoding methods (nbest, nbest oracle, nbest LG) for wenetspeech pruned rnnt2 (#482)
* add other decoding methods for wenetspeech

* changes for RESULTS.md

* add ngram-lm-scale=0.35 results

* set ngram-lm-scale=0.35 as default

* Update README.md

* add nbest-scale for file name
2022-07-29 12:03:08 +08:00
Fangjun Kuang
ec69967584
Set overwrite=True when extracting features in batches. (#487) 2022-07-29 11:17:19 +08:00
Yuekai Zhang
c17233eca7
[Ready] [Recipes] add aishell2 (#465)
* add aishell2

* fix aishell2

* add manifest stats

* update prepare char dict

* fix lint

* setting max duration

* lint

* change context size to 1

* update result

* update hf link

* fix decoding comment

* add more decoding methods

* update result

* change context-size 2 default
2022-07-14 14:46:56 +08:00
Mingshuang Luo
8e0b7ea518
mv split cuts before computing feature (#461) 2022-07-04 11:59:37 +08:00
Mingshuang Luo
10e8bc5b56
do a change (#460) 2022-07-03 19:35:01 +08:00
Fangjun Kuang
ed66877694
Replace ChunkedLilcomHdf5Writer with LilcomChunkyWriter. (#411) 2022-06-09 11:18:52 +08:00
Mingshuang Luo
5079d99ee2
a correction for text2segmentation.py (#407) 2022-06-08 12:06:57 +08:00
Fangjun Kuang
f1abce72f8
Use jsonl for CutSet in the LibriSpeech recipe. (#397)
* Use jsonl for cutsets in the librispeech recipe.

* Use lazy cutset for all recipes.

* More fixes to use lazy CutSet.

* Remove force=True from logging to support Python < 3.8

* Minor fixes.

* Fix style issues.
2022-06-06 10:19:16 +08:00
Ewald Enzinger
8c5722de8c
[egs] Add prefix when reading manifests due to recent lhotse changes (#382)
* [egs] Add prefix when reading manifests due to recent lhotse changes

* Fix wenetspeech

* Fix style issues
2022-05-23 23:37:35 +08:00
Mingshuang Luo
0e57b30495
[Ready to merge] Pruned Transducer Stateless2 for WenetSpeech (char-based) (#349)
* add char-based pruned-rnnt2 for wenetspeech

* style check

* style check

* change for export.py

* do some changes

* do some changes

* a small change for .flake8

* solve the conflicts
2022-05-23 17:13:01 +08:00