1114 Commits

Author SHA1 Message Date
root
b26d3fa596 add logging 2024-06-14 12:08:50 +08:00
root
4ebccebcc0 removing debug log 2024-06-14 12:08:50 +08:00
root
271536248f fix decoding issue and padding to longest 2024-06-14 12:08:50 +08:00
root
eb2c255e1e remove position ids 2024-06-14 12:08:50 +08:00
root
639feab4df update dataset with aishell 2 2024-06-14 12:08:50 +08:00
root
8afb0d647f fix template 2024-06-14 12:08:50 +08:00
Yuekai Zhang
16f18080be update prompt for decoding 2024-06-14 12:08:50 +08:00
Yuekai Zhang
40e4ac480c change prompt 2024-06-14 12:08:50 +08:00
Yuekai Zhang
68b99f456f fix debug 2024-06-14 12:08:50 +08:00
Yuekai Zhang
8bbd06112a add decode log 2024-06-14 12:08:50 +08:00
root
412e926941 fix down sample method 2024-06-14 12:08:50 +08:00
root
796663066f mask unrelated labels 2024-06-14 12:08:50 +08:00
Yuekai Zhang
3ac27d5ad4 fix requirements 2024-06-14 12:08:26 +08:00
Yuekai Zhang
09ec0d6553 add requirements.txt 2024-06-14 12:08:26 +08:00
Yuekai Zhang
19b5b86f9b fix decoding issues 2024-06-14 12:08:26 +08:00
Yuekai Zhang
3dbbc29429 add decode file 2024-06-14 12:08:26 +08:00
Yuekai Zhang
b5a906cbbd fix bugs 2024-06-14 12:08:26 +08:00
Yuekai Zhang
e495c9d732 add whisper llm 2024-06-14 12:08:26 +08:00
Triplecq
3b40d9bbb1
Zipformer recipe for ReazonSpeech (#1611)
* Add first cut at ReazonSpeech recipe

This recipe is mostly based on egs/csj, but tweaked to the point that
it can be run with the ReazonSpeech corpus.

Signed-off-by: Fujimoto Seiji <fujimoto@ceptord.net>

---------

Signed-off-by: Fujimoto Seiji <fujimoto@ceptord.net>
Co-authored-by: Fujimoto Seiji <fujimoto@ceptord.net>
Co-authored-by: Chen <qc@KDM00.cm.cluster>
Co-authored-by: root <root@KDA01.cm.cluster>
2024-06-13 14:19:03 +08:00
Yuekai Zhang
d5be739639
add distill whisper results (#1648) 2024-06-13 00:20:04 +08:00
Fangjun Kuang
13f55d0735
Add merge_tokens for ctc forced alignment (#1649) 2024-06-12 17:45:13 +08:00
Fangjun Kuang
ec0389a3c1
Add doc about FST-based CTC forced alignment. (#1482) 2024-06-12 17:36:57 +08:00
Daniel Povey
4d5c1f2e60
Remove inf from stored stats (#1647) 2024-06-10 22:41:54 +08:00
Fangjun Kuang
130a18cc10
support torch 2.3.1 in docker (#1646) 2024-06-06 22:27:29 +08:00
Fangjun Kuang
b88062292b
Typo fixes (#1643) 2024-06-03 16:49:21 +08:00
zr_jin
42a97f6d7b
Update env.py (#1635) 2024-05-22 22:29:38 +08:00
zr_jin
1adf1e441d
Removed unused `k2` dependencies from the AT recipe (#1633) 2024-05-21 18:22:19 +08:00
Zengwei Yao
0df406c5da
Initialize BiasNorm bias with small random values (#1630) 2024-05-20 22:32:02 +08:00
zr_jin
68980c5d0a
Fix an error that occurred during mmi preparation (#1626)
* init commit

* updated
2024-05-17 19:45:15 +08:00
zr_jin
9d570870cf
Update asr_datamodule.py (#1619) 2024-05-07 21:37:55 +08:00
Yifan Yang
4e97b19b63
Remove duplicate logging initialization logic in utils.py (#1617) 2024-05-06 13:00:27 +08:00
Zengwei Yao
c08fe48603
add force=True to logging.basicConfig (#1613) 2024-05-04 11:42:23 +08:00
Yuekai Zhang
6d7c1d13a5
update speechio whisper ft results (#1605)
* update speechio whisper ft results
2024-04-30 11:49:20 +08:00
Wei Kang
b49351fc39
Update README.md for conformer-ctc (#1609) 2024-04-28 09:56:13 +08:00
Dongji Gao
9a17f4ce41
add OTC related scripts using phone as units instead of BPEs (#1602)
* add otc related scripts using phone instead of bpe
2024-04-26 00:55:44 +08:00
zzasdf
25cabb7663
fix error in padding computing (#1607) 2024-04-25 22:40:07 +08:00
Xiaoyu Yang
df36f93bd8
add small-scaled model for audio tagging (#1604) 2024-04-24 17:00:42 +08:00
Yifan Yang
368b7d10a7
clear log handlers before setup (#1603) 2024-04-24 15:31:25 +09:00
zr_jin
9f8f0bceb5
Update prepare.sh (#1601) 2024-04-20 23:02:02 +09:00
Yifan Yang
ed6bc200e3
Update train.py (#1590) 2024-04-11 19:35:25 +08:00
Fangjun Kuang
ba5b2e854b
Return probs in audio tagging onnx models (#1586) 2024-04-10 09:03:30 +08:00
Fangjun Kuang
fa5d861af0
Add CI test for the AudioSet recipe. (#1585) 2024-04-09 17:45:00 +08:00
yh646492956
f5d7818733
fix run.sh script in wenetspeech KWS (#1584)
Co-authored-by: Hao You <13182720519@sina.cn>
2024-04-09 15:16:12 +08:00
Xiaoyu Yang
1732dafe24
Add zipformer recipe for audio tagging (#1421) 2024-04-09 12:06:14 +08:00
zr_jin
f2e36ec414
Zipformer recipe for CommonVoice (#1546)
* added scripts for char-based lang prep training scripts

* added `Zipformer` recipe for commonvoice

---------

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2024-04-09 11:37:08 +08:00
Yifan Yang
87843e9382
k2SSL: a Faster and Better Framework for Self-Supervised Speech Representation Learning (#1500)
* Add k2SSL

* fix flake8

* fix for black

* fix for black

* fix for black

* Update ssl_datamodule.py

* Fix bugs in HubertDataset

* update comments

* add librilight

* add checkpoint convert script

* format

---------

Co-authored-by: yifanyeung <yifanyeung@yifanyeung.local>
Co-authored-by: zzasdf <15218404468@163.com>
2024-04-04 23:29:16 +08:00
Fangjun Kuang
c45e9fecfb
support torch 2.2.2 in docker images (#1578) 2024-04-03 11:26:24 +08:00
Wei Kang
9369c2bef9
Add comments to prepare.sh in aidatatang (#1575) 2024-04-02 16:08:09 +08:00
Dadoou
6cbddaa8e3
Add base choice to model_name argument for whisper model. (#1573)
Co-authored-by: dadoou <dadoou@yandex.com>
2024-04-02 09:47:38 +08:00
Wei Kang
42de459110
Fix decoding finetune model (#1568) 2024-03-26 10:38:21 +08:00