698 Commits

Author SHA1 Message Date
Zengwei Yao
785f3f0bcf
Update RESULTS.md, adding results and model links of zipformer-small/medium CTC/AED models (#1683) 2024-07-09 20:04:47 +08:00
zr_jin
2d64228efa
Update attention_decoder.py (#1681) 2024-07-06 09:01:34 +08:00
Zengwei Yao
f76afff741
Support CTC/AED option for Zipformer recipe (#1389)
* add attention-decoder loss option for zipformer recipe

* add attention-decoder-rescoring

* update export.py and pretrained_ctc.py

* update RESULTS.md
2024-07-05 20:19:18 +08:00
Yifan Yang
cbcac23d26
Fix typos, remove unused packages, normalize comments (#1678) 2024-07-04 14:19:45 +08:00
Manix
eaab2c819f
Zipformer Onnx FP16 (#1671)
Signed-off-by: manickavela29 <manickavela1998@gmail.com>
2024-06-27 16:08:24 +08:00
Fangjun Kuang
3059eb4511
Fix doc URLs (#1660) 2024-06-21 11:10:14 +08:00
Fangjun Kuang
b88062292b
Typo fixes (#1643) 2024-06-03 16:49:21 +08:00
Zengwei Yao
0df406c5da
Initialize BiasNorm bias with small random values (#1630) 2024-05-20 22:32:02 +08:00
zr_jin
68980c5d0a
Fix an error that occurred during MMI preparation (#1626)
* init commit

* updated
2024-05-17 19:45:15 +08:00
Dongji Gao
9a17f4ce41
add OTC related scripts using phone as units instead of BPEs (#1602)
* add otc related scripts using phone instead of bpe
2024-04-26 00:55:44 +08:00
zzasdf
25cabb7663
fix error in padding computing (#1607) 2024-04-25 22:40:07 +08:00
Yifan Yang
ed6bc200e3
Update train.py (#1590) 2024-04-11 19:35:25 +08:00
Yifan Yang
87843e9382
k2SSL: a Faster and Better Framework for Self-Supervised Speech Representation Learning (#1500)
* Add k2SSL

* fix flake8

* fix for black

* fix for black

* fix for black

* Update ssl_datamodule.py

* Fix bugs in HubertDataset

* update comments

* add librilight

* add checkpoint convert script

* format

---------

Co-authored-by: yifanyeung <yifanyeung@yifanyeung.local>
Co-authored-by: zzasdf <15218404468@163.com>
2024-04-04 23:29:16 +08:00
Zengwei Yao
353469182c
fix issue in zipformer.py (#1566) 2024-03-21 15:59:43 +08:00
Xiaoyu Yang
bddc3fca7a
Fix adapter in streaming_forward (#1560) 2024-03-21 15:08:58 +08:00
Fangjun Kuang
489263e5bb
Add streaming HLG decoding for zipformer CTC. (#1557)
Note it supports only CPU.
2024-03-18 20:11:47 +08:00
Karel Vesely
4917ac8bab
allow export of onnx-streaming-models with other than 80dim input features (#1556) 2024-03-18 18:43:29 +08:00
Xiaoyu Yang
2dfd5dbf8b
Add LoRA for Zipformer (#1540) 2024-03-15 17:19:23 +08:00
zr_jin
eb132da00d
additional instruction for the grad_scale is too small error (#1550) 2024-03-14 11:33:49 +08:00
zr_jin
335a9962de
Fixed formatting issue of PR #1528 (#1530) 2024-03-06 08:43:45 +08:00
Rezakh20
ff430b465f
Add num_features to train.py for training WSASR (#1528) 2024-03-05 16:40:30 +08:00
zr_jin
242002e0bd
Strengthened style constraints (#1527) 2024-03-04 23:28:04 +08:00
Xiaoyu Yang
7e2b561bbf
Add recipe for fine-tuning Zipformer with adapter (#1512) 2024-02-29 10:57:38 +08:00
Wei Kang
aac7df064a
Recipes for open vocabulary keyword spotting (#1428)
* English recipe on gigaspeech; Chinese recipe on wenetspeech
2024-02-22 15:31:20 +08:00
Wei Kang
711d6bc462
Refactor prepare.sh in librispeech (#1493)
* Refactor prepare.sh in librispeech, breaking it into three parts: prepare.sh (basic, minimal requirement for transducer), prepare_lm.sh (ngram & nnlm stuff), and prepare_mmi.sh (for MMI training).
2024-02-09 10:44:19 +08:00
Xiaoyu Yang
777074046d
Fine-tune recipe for Zipformer (#1484)
1. support finetune zipformer
2. update the usage; set a very large batch count
2024-02-06 18:25:43 +08:00
Teo Wen Shen
b9e6327adf
Fixing torch.ctc err (#1485)
* fixing torch.ctc err

* Move targets & lengths to CPU
2024-02-03 06:25:27 +08:00
zr_jin
37b975cac9
fixed a CI test for wenetspeech (#1476)
* Comply with issue #1149

https://github.com/k2-fsa/icefall/issues/1149
2024-01-27 06:41:56 +08:00
Yuekai Zhang
1c30847947
Whisper Fine-tuning Recipe on Aishell1 (#1466)
* add decode seamlessm4t

* add requirements

* add decoding with avg model

* add token files

* add custom tokenizer

* support deepspeed to finetune large model

* support large-v3

* add model saving

* using monkey patch to replace models

* add manifest dir option
2024-01-27 00:32:30 +08:00
Fangjun Kuang
8d39f9508b
Fix torchscript export to use tokens.txt instead of lang_dir (#1475) 2024-01-26 19:18:33 +08:00
Zengwei Yao
c401a2646b
minor fix of zipformer/optim.py (#1474) 2024-01-26 15:50:11 +08:00
Yifan Yang
5dfc3ed7f9
Fix buffer size of DynamicBucketingSampler (#1468)
* Fix buffer size

* Fix for flake8

---------

Co-authored-by: yifanyeung <yifanyeung@yifanyeung.local>
2024-01-21 02:10:42 +08:00
zr_jin
5445ea6df6
Use shuffled LibriSpeech cuts instead (#1450)
* use shuffled LibriSpeech cuts instead

* leave the old code in comments for reference
2024-01-08 15:09:21 +08:00
Karel Vesely
716b82cc3a
streaming_decode.py, relax the audio range from [-1,+1] to [-10,+10] (#1448)
- some AudioTransform classes produce audio signals out of range [-1,+1]
   - Resample produced 1.0079
   - The range [-10,+10] was chosen to still be able to reliably
     distinguish from the [-32k,+32k] signal...
- this is related to : https://github.com/lhotse-speech/lhotse/issues/1254
2024-01-05 10:21:27 +08:00
Fangjun Kuang
8136ad775b
Use high_freq -400 in computing fbank features. (#1447)
See also https://github.com/k2-fsa/sherpa-onnx/issues/514
2024-01-04 13:59:32 +08:00
Fangjun Kuang
79a42148db
Add CI test to cover zipformer/train.py (#1424) 2023-12-23 00:38:36 +08:00
Fangjun Kuang
f85f0252a9
Add greedy search for streaming zipformer CTC. (#1415) 2023-12-13 17:34:12 +08:00
zr_jin
d0da509055
Support ONNX export for Streaming CTC Encoder (#1413)
* Create export-onnx-streaming-ctc.py

* doc_str updated

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

---------

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-12-13 10:33:28 +08:00
Fangjun Kuang
20a82c9abf
first commit (#1411) 2023-12-12 18:13:26 +08:00
Fangjun Kuang
b0f70c9d04
Fix torch.jit.script() export for pruned_transducer_stateless2 (#1410) 2023-12-10 11:38:39 +08:00
Fangjun Kuang
e9ec827de7
Rename zipformer2 to zipformer_for_ncnn_export_only to avoid confusion. (#1407) 2023-12-08 14:29:24 +08:00
LoganLiu66
f08af2fa22
fix initial states (#1398)
Co-authored-by: liujiawang02 <liujiawang02@baidu.com>
2023-12-04 22:29:42 +08:00
Wei Kang
11d816d174
Add customized score for hotwords (#1385)
* add custom score for each hotword

* Add more comments

* Fix decode

* fix style

* minor fixes
2023-11-18 18:47:55 +08:00
Fangjun Kuang
666d69b20d
Rename train2.py to avoid confusion (#1386) 2023-11-17 18:12:59 +08:00
zr_jin
231bbcd2b6
Update optim.py (#1366) 2023-11-03 12:06:29 +08:00
zr_jin
9e5a5d7839
Incorporate some latest changes to optim.py (#1359)
* init commit

* black formatted

* isort formatted
2023-11-02 16:10:08 +08:00
zr_jin
23913f6afd
Minor refinements for some stale but recently merged PRs (#1354)
* incorporate https://github.com/k2-fsa/icefall/pull/1269

* incorporate https://github.com/k2-fsa/icefall/pull/1301

* black formatted

* incorporate https://github.com/k2-fsa/icefall/pull/1162

* black formatted
2023-10-31 10:28:20 +08:00
Tiance Wang
c970df512b
New recipe: tiny_transducer_ctc (#848)
* initial commit

* update readme

* Update README.md

* change bool to str2bool for arg parser

* run validation only at the end of epoch

* black format

* black format
2023-10-30 12:09:39 +08:00
Desh Raj
7d56685734
[recipe] LibriSpeech zipformer_ctc (#941)
* merge upstream

* initial commit for zipformer_ctc

* remove unwanted changes

* remove changes to other recipe

* fix zipformer softlink

* fix for JIT export

* add missing file

* fix symbolic links

* update results

* Update RESULTS.md

Address comments from @csukuangfj

---------

Co-authored-by: zr_jin <peter.jin.cn@gmail.com>
2023-10-27 13:38:09 +08:00
Zengwei Yao
c0a53271e2
Update Zipformer-large result on LibriSpeech (#1343)
* update zipformer-large result on librispeech
2023-10-26 17:35:12 +08:00