724 Commits

Author SHA1 Message Date
Xiaoyu Yang
7e2b561bbf
Add recipe for fine-tuning Zipformer with adapter (#1512) 2024-02-29 10:57:38 +08:00
Wei Kang
aac7df064a
Recipes for open vocabulary keyword spotting (#1428)
* English recipe on gigaspeech; Chinese recipe on wenetspeech
2024-02-22 15:31:20 +08:00
Wei Kang
711d6bc462
Refactor prepare.sh in librispeech (#1493)
* Refactor prepare.sh in librispeech, break it into three parts,  prepare.sh (basic, minimal requirement for transducer), prepare_lm.sh (ngram & nnlm staff), prepare_mmi.sh (for MMI training).
2024-02-09 10:44:19 +08:00
Xiaoyu Yang
777074046d
Fine-tune recipe for Zipformer (#1484)
1. support finetune zipformer
2. update the usage; set a very large batch count
2024-02-06 18:25:43 +08:00
Teo Wen Shen
b9e6327adf
Fixing torch.ctc err (#1485)
* fixing torch.ctc err

* Move targets & lengths to CPU
2024-02-03 06:25:27 +08:00
zr_jin
37b975cac9
fixed a CI test for wenetspeech (#1476)
* Comply to issue #1149

https://github.com/k2-fsa/icefall/issues/1149
2024-01-27 06:41:56 +08:00
Yuekai Zhang
1c30847947
Whisper Fine-tuning Recipe on Aishell1 (#1466)
* add decode seamlessm4t

* add requirements

* add decoding with avg model

* add token files

* add custom tokenizer

* support deepspeed to finetune large model

* support large-v3

* add model saving

* using monkey patch to replace models

* add manifest dir option
2024-01-27 00:32:30 +08:00
Fangjun Kuang
8d39f9508b
Fix torchscript export to use tokens.txt instead of lang_dir (#1475) 2024-01-26 19:18:33 +08:00
Zengwei Yao
c401a2646b
minor fix of zipformer/optim.py (#1474) 2024-01-26 15:50:11 +08:00
Yifan Yang
5dfc3ed7f9
Fix buffer size of DynamicBucketingSampler (#1468)
* Fix buffer size

* Fix for flake8

---------

Co-authored-by: yifanyeung <yifanyeung@yifanyeung.local>
2024-01-21 02:10:42 +08:00
zr_jin
5445ea6df6
Use shuffled LibriSpeech cuts instead (#1450)
* use shuffled LibriSpeech cuts instead

* leave the old code in comments for reference
2024-01-08 15:09:21 +08:00
Karel Vesely
716b82cc3a
streaming_decode.py, relax the audio range from [-1,+1] to [-10,+10] (#1448)
- some AudioTransform classes produce audio signals out of range [-1,+1]
   - Resample produced 1.0079
   - The range [-10,+10] was chosen to still be able to reliably
     distinguish from the [-32k,+32k] signal...
- this is related to : https://github.com/lhotse-speech/lhotse/issues/1254
2024-01-05 10:21:27 +08:00
Fangjun Kuang
8136ad775b
Use high_freq -400 in computing fbank features. (#1447)
See also https://github.com/k2-fsa/sherpa-onnx/issues/514
2024-01-04 13:59:32 +08:00
Fangjun Kuang
79a42148db
Add CI test to cover zipformer/train.py (#1424) 2023-12-23 00:38:36 +08:00
Fangjun Kuang
f85f0252a9
Add greedy search for streaming zipformer CTC. (#1415) 2023-12-13 17:34:12 +08:00
zr_jin
d0da509055
Support ONNX export for Streaming CTC Encoder (#1413)
* Create export-onnx-streaming-ctc.py

* doc_str updated

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

---------

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-12-13 10:33:28 +08:00
Fangjun Kuang
20a82c9abf
first commit (#1411) 2023-12-12 18:13:26 +08:00
Fangjun Kuang
b0f70c9d04
Fix torch.jit.script() export for pruned_transducer_stateless2 (#1410) 2023-12-10 11:38:39 +08:00
Fangjun Kuang
e9ec827de7
Rename zipformer2 to zipformer_for_ncnn_export_only to avoid confusion. (#1407) 2023-12-08 14:29:24 +08:00
LoganLiu66
f08af2fa22
fix initial states (#1398)
Co-authored-by: liujiawang02 <liujiawang02@baidu.com>
2023-12-04 22:29:42 +08:00
Wei Kang
11d816d174
Add cumstomized score for hotwords (#1385)
* add custom score for each hotword

* Add more comments

* Fix deocde

* fix style

* minor fixes
2023-11-18 18:47:55 +08:00
Fangjun Kuang
666d69b20d
Rename train2.py to avoid confusion (#1386) 2023-11-17 18:12:59 +08:00
zr_jin
231bbcd2b6
Update optim.py (#1366) 2023-11-03 12:06:29 +08:00
zr_jin
9e5a5d7839
Incorporate some latest changes to optim.py (#1359)
* init commit

* black formatted

* isort formatted
2023-11-02 16:10:08 +08:00
zr_jin
23913f6afd
Minor refinements for some stale but recently merged PRs (#1354)
* incorporate https://github.com/k2-fsa/icefall/pull/1269

* incorporate https://github.com/k2-fsa/icefall/pull/1301

* black formatted

* incorporate https://github.com/k2-fsa/icefall/pull/1162

* black formatted
2023-10-31 10:28:20 +08:00
Tiance Wang
c970df512b
New recipe: tiny_transducer_ctc (#848)
* initial commit

* update readme

* Update README.md

* change bool to str2bool for arg parser

* run validation only at the end of epoch

* black format

* black format
2023-10-30 12:09:39 +08:00
Desh Raj
7d56685734
[recipe] LibriSpeech zipformer_ctc (#941)
* merge upstream

* initial commit for zipformer_ctc

* remove unwanted changes

* remove changes to other recipe

* fix zipformer softlink

* fix for JIT export

* add missing file

* fix symbolic links

* update results

* Update RESULTS.md

Address comments from @csukuangfj

---------

Co-authored-by: zr_jin <peter.jin.cn@gmail.com>
2023-10-27 13:38:09 +08:00
Zengwei Yao
c0a53271e2
Update Zipformer-large result on LibriSpeech (#1343)
* update zipformer-large result on librispeech
2023-10-26 17:35:12 +08:00
zr_jin
1814bbb0e7
typo fixed (#1334) 2023-10-25 00:03:33 +08:00
zr_jin
f9980aa606
minor fixes (#1332) 2023-10-24 08:17:17 +08:00
zr_jin
92ef561ff7
Minor fixes for torch.jit.script support (#1329) 2023-10-24 01:10:50 +08:00
Karel Vesely
543b4cc1ca
small enhanecements (#1322)
- add extra check of 'x' and 'x_lens' to earlier point in Transducer model
- specify 'utf' encoding when opening text files for writing (recogs,
  errs)
2023-10-19 21:53:31 +08:00
marcoyang1998
52c24df61d
Fix model avg (#1317)
* fix a bug about the model_avg during finetuning by exchanging the order of loading pre-trained model and initializing avg model

* only match the exact module prefix
2023-10-18 17:36:14 +08:00
Erwan Zerhouni
807816fec0
Fix chunk issue for sherpa (#1316) 2023-10-18 16:07:10 +08:00
zr_jin
d2bd0933b1
Compatibility with the latest Lhotse (#1314) 2023-10-17 21:22:32 +08:00
zr_jin
162ceaf4b3
fixes for data preparation (#1307)
Issue: #1306
2023-10-12 17:05:41 +08:00
Wen Ding
2b3c5d799f
Fix padding issues (#1303) 2023-10-11 16:58:00 +08:00
Fangjun Kuang
cb874e9905
add export-onnx.py for stateless8 (#1302)
* add export-onnx.py for stateless8

* use tokens.txt to replace bpe.model
2023-10-11 12:20:12 +08:00
Zengwei Yao
9af144c26b
Zipformer update result (#1296)
* update Zipformer results
2023-10-09 23:15:22 +08:00
zr_jin
fefffc02f6
Update optim.py (#1292) 2023-10-09 17:39:23 +08:00
Fangjun Kuang
109354b6b8
Add CTC HLG decoding for zipformer (#1287) 2023-10-02 14:00:06 +08:00
Fangjun Kuang
f14b673408
Add HLG decoding with OpenFst on CPU for aishell conformer_ctc (#1279) 2023-10-01 13:46:16 +08:00
Fangjun Kuang
772ee3955b
Support HLG decoding using OpenFst with kaldi decoders (#1275) 2023-09-27 14:49:27 +08:00
Fangjun Kuang
2318c3fbd0
Support CTC decoding on CPU using OpenFst and kaldi decoders. (#1244) 2023-09-26 16:36:19 +08:00
marcoyang1998
e17f884ace
Fix docs for MVQ (#1272)
* typo fix
2023-09-25 15:36:40 +08:00
zr_jin
ef5da4824d
formatted the entire LibriSpeech recipe (#1270)
* formatted the entire librispeech recipe

* minor updates
2023-09-24 17:31:01 +08:00
zr_jin
ef658d691e
fixes for init value of diagnostics.TensorDiagnosticOptions (#1269)
* fixes for `diagnostics`

Replace `2 ** 22` with `512` as the default value of `diagnostics.TensorDiagnosticOptions`

also black formatted some scripts

* fixed formatting issues
2023-09-24 17:06:47 +08:00
Fangjun Kuang
34e40a86b3
Fix exporting decoder model to onnx (#1264)
* Use torch.jit.script() to export the decoder model

See also https://github.com/k2-fsa/sherpa-onnx/issues/327
2023-09-22 09:57:15 +08:00
Fangjun Kuang
f5dc957d44
Fix CI tests (#1266) 2023-09-21 21:16:14 +08:00
l2009312042
45d60ef262
Update conformer.py (#1200)
* Update conformer.py
* Update zipformer.py

fix bug in get_dynamic_dropout_rate
2023-09-21 19:41:10 +08:00