690 Commits

Author SHA1 Message Date
zr_jin
2d64228efa
Update attention_decoder.py (#1681) 2024-07-06 09:01:34 +08:00
Zengwei Yao
f76afff741
Support CTC/AED option for Zipformer recipe (#1389)
* add attention-decoder loss option for zipformer recipe

* add attention-decoder-rescoring

* update export.py and pretrained_ctc.py

* update RESULTS.md
2024-07-05 20:19:18 +08:00
Yifan Yang
cbcac23d26
Fix typos, remove unused packages, normalize comments (#1678) 2024-07-04 14:19:45 +08:00
Manix
eaab2c819f
Zipformer Onnx FP16 (#1671)
Signed-off-by: manickavela29 <manickavela1998@gmail.com>
2024-06-27 16:08:24 +08:00
Fangjun Kuang
3059eb4511
Fix doc URLs (#1660) 2024-06-21 11:10:14 +08:00
Fangjun Kuang
b88062292b
Typo fixes (#1643) 2024-06-03 16:49:21 +08:00
Zengwei Yao
0df406c5da
Initialize BiasNorm bias with small random values (#1630) 2024-05-20 22:32:02 +08:00
zr_jin
68980c5d0a
Fix an error occured during mmi preparation (#1626)
* init commit

* updated
2024-05-17 19:45:15 +08:00
Yifan Yang
ed6bc200e3
Update train.py (#1590) 2024-04-11 19:35:25 +08:00
Zengwei Yao
353469182c
fix issue in zipformer.py (#1566) 2024-03-21 15:59:43 +08:00
Xiaoyu Yang
bddc3fca7a
Fix adapter in streaming_forward (#1560) 2024-03-21 15:08:58 +08:00
Fangjun Kuang
489263e5bb
Add streaming HLG decoding for zipformer CTC. (#1557)
Note it supports only CPU.
2024-03-18 20:11:47 +08:00
Karel Vesely
4917ac8bab
allow export of onnx-streaming-models with other than 80dim input features (#1556) 2024-03-18 18:43:29 +08:00
Xiaoyu Yang
2dfd5dbf8b
Add LoRA for Zipformer (#1540) 2024-03-15 17:19:23 +08:00
zr_jin
eb132da00d
additional instruction for the grad_scale is too small error (#1550) 2024-03-14 11:33:49 +08:00
zr_jin
242002e0bd
Strengthened style constraints (#1527) 2024-03-04 23:28:04 +08:00
Xiaoyu Yang
7e2b561bbf
Add recipe for fine-tuning Zipformer with adapter (#1512) 2024-02-29 10:57:38 +08:00
Wei Kang
aac7df064a
Recipes for open vocabulary keyword spotting (#1428)
* English recipe on gigaspeech; Chinese recipe on wenetspeech
2024-02-22 15:31:20 +08:00
Wei Kang
711d6bc462
Refactor prepare.sh in librispeech (#1493)
* Refactor prepare.sh in librispeech, break it into three parts,  prepare.sh (basic, minimal requirement for transducer), prepare_lm.sh (ngram & nnlm staff), prepare_mmi.sh (for MMI training).
2024-02-09 10:44:19 +08:00
Xiaoyu Yang
777074046d
Fine-tune recipe for Zipformer (#1484)
1. support finetune zipformer
2. update the usage; set a very large batch count
2024-02-06 18:25:43 +08:00
Teo Wen Shen
b9e6327adf
Fixing torch.ctc err (#1485)
* fixing torch.ctc err

* Move targets & lengths to CPU
2024-02-03 06:25:27 +08:00
zr_jin
37b975cac9
fixed a CI test for wenetspeech (#1476)
* Comply to issue #1149

https://github.com/k2-fsa/icefall/issues/1149
2024-01-27 06:41:56 +08:00
Yuekai Zhang
1c30847947
Whisper Fine-tuning Recipe on Aishell1 (#1466)
* add decode seamlessm4t

* add requirements

* add decoding with avg model

* add token files

* add custom tokenizer

* support deepspeed to finetune large model

* support large-v3

* add model saving

* using monkey patch to replace models

* add manifest dir option
2024-01-27 00:32:30 +08:00
Fangjun Kuang
8d39f9508b
Fix torchscript export to use tokens.txt instead of lang_dir (#1475) 2024-01-26 19:18:33 +08:00
Zengwei Yao
c401a2646b
minor fix of zipformer/optim.py (#1474) 2024-01-26 15:50:11 +08:00
Yifan Yang
5dfc3ed7f9
Fix buffer size of DynamicBucketingSampler (#1468)
* Fix buffer size

* Fix for flake8

---------

Co-authored-by: yifanyeung <yifanyeung@yifanyeung.local>
2024-01-21 02:10:42 +08:00
zr_jin
5445ea6df6
Use shuffled LibriSpeech cuts instead (#1450)
* use shuffled LibriSpeech cuts instead

* leave the old code in comments for reference
2024-01-08 15:09:21 +08:00
Karel Vesely
716b82cc3a
streaming_decode.py, relax the audio range from [-1,+1] to [-10,+10] (#1448)
- some AudioTransform classes produce audio signals out of range [-1,+1]
   - Resample produced 1.0079
   - The range [-10,+10] was chosen to still be able to reliably
     distinguish from the [-32k,+32k] signal...
- this is related to : https://github.com/lhotse-speech/lhotse/issues/1254
2024-01-05 10:21:27 +08:00
Fangjun Kuang
8136ad775b
Use high_freq -400 in computing fbank features. (#1447)
See also https://github.com/k2-fsa/sherpa-onnx/issues/514
2024-01-04 13:59:32 +08:00
Fangjun Kuang
79a42148db
Add CI test to cover zipformer/train.py (#1424) 2023-12-23 00:38:36 +08:00
Fangjun Kuang
f85f0252a9
Add greedy search for streaming zipformer CTC. (#1415) 2023-12-13 17:34:12 +08:00
zr_jin
d0da509055
Support ONNX export for Streaming CTC Encoder (#1413)
* Create export-onnx-streaming-ctc.py

* doc_str updated

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

---------

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-12-13 10:33:28 +08:00
Fangjun Kuang
20a82c9abf
first commit (#1411) 2023-12-12 18:13:26 +08:00
Fangjun Kuang
b0f70c9d04
Fix torch.jit.script() export for pruned_transducer_stateless2 (#1410) 2023-12-10 11:38:39 +08:00
Fangjun Kuang
e9ec827de7
Rename zipformer2 to zipformer_for_ncnn_export_only to avoid confusion. (#1407) 2023-12-08 14:29:24 +08:00
LoganLiu66
f08af2fa22
fix initial states (#1398)
Co-authored-by: liujiawang02 <liujiawang02@baidu.com>
2023-12-04 22:29:42 +08:00
Wei Kang
11d816d174
Add cumstomized score for hotwords (#1385)
* add custom score for each hotword

* Add more comments

* Fix deocde

* fix style

* minor fixes
2023-11-18 18:47:55 +08:00
Fangjun Kuang
666d69b20d
Rename train2.py to avoid confusion (#1386) 2023-11-17 18:12:59 +08:00
zr_jin
231bbcd2b6
Update optim.py (#1366) 2023-11-03 12:06:29 +08:00
zr_jin
9e5a5d7839
Incorporate some latest changes to optim.py (#1359)
* init commit

* black formatted

* isort formatted
2023-11-02 16:10:08 +08:00
zr_jin
23913f6afd
Minor refinements for some stale but recently merged PRs (#1354)
* incorporate https://github.com/k2-fsa/icefall/pull/1269

* incorporate https://github.com/k2-fsa/icefall/pull/1301

* black formatted

* incorporate https://github.com/k2-fsa/icefall/pull/1162

* black formatted
2023-10-31 10:28:20 +08:00
Tiance Wang
c970df512b
New recipe: tiny_transducer_ctc (#848)
* initial commit

* update readme

* Update README.md

* change bool to str2bool for arg parser

* run validation only at the end of epoch

* black format

* black format
2023-10-30 12:09:39 +08:00
Desh Raj
7d56685734
[recipe] LibriSpeech zipformer_ctc (#941)
* merge upstream

* initial commit for zipformer_ctc

* remove unwanted changes

* remove changes to other recipe

* fix zipformer softlink

* fix for JIT export

* add missing file

* fix symbolic links

* update results

* Update RESULTS.md

Address comments from @csukuangfj

---------

Co-authored-by: zr_jin <peter.jin.cn@gmail.com>
2023-10-27 13:38:09 +08:00
Zengwei Yao
c0a53271e2
Update Zipformer-large result on LibriSpeech (#1343)
* update zipformer-large result on librispeech
2023-10-26 17:35:12 +08:00
zr_jin
1814bbb0e7
typo fixed (#1334) 2023-10-25 00:03:33 +08:00
zr_jin
f9980aa606
minor fixes (#1332) 2023-10-24 08:17:17 +08:00
zr_jin
92ef561ff7
Minor fixes for torch.jit.script support (#1329) 2023-10-24 01:10:50 +08:00
Karel Vesely
543b4cc1ca
small enhanecements (#1322)
- add extra check of 'x' and 'x_lens' to earlier point in Transducer model
- specify 'utf' encoding when opening text files for writing (recogs,
  errs)
2023-10-19 21:53:31 +08:00
marcoyang1998
52c24df61d
Fix model avg (#1317)
* fix a bug about the model_avg during finetuning by exchanging the order of loading pre-trained model and initializing avg model

* only match the exact module prefix
2023-10-18 17:36:14 +08:00
Erwan Zerhouni
807816fec0
Fix chunk issue for sherpa (#1316) 2023-10-18 16:07:10 +08:00