Xiaoyu Yang
7e2b561bbf
Add recipe for fine-tuning Zipformer with adapter ( #1512 )
2024-02-29 10:57:38 +08:00
Wei Kang
aac7df064a
Recipes for open vocabulary keyword spotting ( #1428 )
...
* English recipe on gigaspeech; Chinese recipe on wenetspeech
2024-02-22 15:31:20 +08:00
Wei Kang
711d6bc462
Refactor prepare.sh in librispeech ( #1493 )
...
* Refactor prepare.sh in librispeech, breaking it into three parts: prepare.sh (basic, the minimal requirement for transducer), prepare_lm.sh (n-gram and NNLM stuff), and prepare_mmi.sh (for MMI training).
2024-02-09 10:44:19 +08:00
Xiaoyu Yang
777074046d
Fine-tune recipe for Zipformer ( #1484 )
...
1. support fine-tuning Zipformer
2. update the usage; set a very large batch count
2024-02-06 18:25:43 +08:00
Teo Wen Shen
b9e6327adf
Fixing torch.ctc err ( #1485 )
...
* fixing torch.ctc err
* Move targets & lengths to CPU
2024-02-03 06:25:27 +08:00
zr_jin
37b975cac9
fixed a CI test for wenetspeech ( #1476 )
...
* Comply with issue #1149: https://github.com/k2-fsa/icefall/issues/1149
2024-01-27 06:41:56 +08:00
Yuekai Zhang
1c30847947
Whisper Fine-tuning Recipe on Aishell1 ( #1466 )
...
* add decode seamlessm4t
* add requirements
* add decoding with avg model
* add token files
* add custom tokenizer
* support deepspeed to finetune large model
* support large-v3
* add model saving
* using monkey patch to replace models
* add manifest dir option
2024-01-27 00:32:30 +08:00
Fangjun Kuang
8d39f9508b
Fix torchscript export to use tokens.txt instead of lang_dir ( #1475 )
2024-01-26 19:18:33 +08:00
Zengwei Yao
c401a2646b
minor fix of zipformer/optim.py ( #1474 )
2024-01-26 15:50:11 +08:00
Yifan Yang
5dfc3ed7f9
Fix buffer size of DynamicBucketingSampler ( #1468 )
...
* Fix buffer size
* Fix for flake8
---------
Co-authored-by: yifanyeung <yifanyeung@yifanyeung.local>
2024-01-21 02:10:42 +08:00
zr_jin
5445ea6df6
Use shuffled LibriSpeech cuts instead ( #1450 )
...
* use shuffled LibriSpeech cuts instead
* leave the old code in comments for reference
2024-01-08 15:09:21 +08:00
Karel Vesely
716b82cc3a
streaming_decode.py, relax the audio range from [-1,+1] to [-10,+10] ( #1448 )
...
- some AudioTransform classes produce audio signals out of range [-1,+1]
- Resample produced 1.0079
- The range [-10,+10] was chosen to still be able to reliably distinguish from the [-32k,+32k] signal...
- this is related to: https://github.com/lhotse-speech/lhotse/issues/1254
2024-01-05 10:21:27 +08:00
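The range relaxation described in #1448 can be illustrated roughly as follows. This is a minimal sketch, not the code from streaming_decode.py: the helper name `check_audio_range` and its `limit` parameter are hypothetical, and plain Python lists stand in for the audio array.

```python
def check_audio_range(samples, limit=10.0):
    """Return the peak absolute amplitude, raising if it exceeds `limit`.

    Hypothetical helper (not from the repository). Float audio normalized
    to roughly [-1, +1] may overshoot slightly after transforms such as
    resampling (e.g. 1.0079), while un-normalized 16-bit samples reach
    about +/-32768. A limit of 10 tolerates the former and still rejects
    the latter.
    """
    peak = max((abs(float(s)) for s in samples), default=0.0)
    if peak > limit:
        raise ValueError(
            f"peak amplitude {peak} exceeds {limit}; "
            "audio looks un-normalized (16-bit integer scale?)"
        )
    return peak


slightly_over = [0.25, -1.0079, 0.5]      # small overshoot after resampling
int16_scale = [1200.0, -32768.0, 845.0]   # un-normalized samples

print(check_audio_range(slightly_over))   # accepted under the relaxed limit
try:
    check_audio_range(int16_scale)
except ValueError as err:
    print("rejected:", err)
```

A strict limit of 1 would reject the first signal even though it is effectively normalized; the looser bound keeps the check useful as a guard against feeding integer-scaled audio.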
Fangjun Kuang
8136ad775b
Use high_freq -400 in computing fbank features. ( #1447 )
...
See also https://github.com/k2-fsa/sherpa-onnx/issues/514
2024-01-04 13:59:32 +08:00
Fangjun Kuang
79a42148db
Add CI test to cover zipformer/train.py ( #1424 )
2023-12-23 00:38:36 +08:00
Fangjun Kuang
f85f0252a9
Add greedy search for streaming zipformer CTC. ( #1415 )
2023-12-13 17:34:12 +08:00
zr_jin
d0da509055
Support ONNX export for Streaming CTC Encoder ( #1413 )
...
* Create export-onnx-streaming-ctc.py
* doc_str updated
---------
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-12-13 10:33:28 +08:00
Fangjun Kuang
20a82c9abf
first commit ( #1411 )
2023-12-12 18:13:26 +08:00
Fangjun Kuang
b0f70c9d04
Fix torch.jit.script() export for pruned_transducer_stateless2 ( #1410 )
2023-12-10 11:38:39 +08:00
Fangjun Kuang
e9ec827de7
Rename zipformer2 to zipformer_for_ncnn_export_only to avoid confusion. ( #1407 )
2023-12-08 14:29:24 +08:00
LoganLiu66
f08af2fa22
fix initial states ( #1398 )
...
Co-authored-by: liujiawang02 <liujiawang02@baidu.com>
2023-12-04 22:29:42 +08:00
Wei Kang
11d816d174
Add customized score for hotwords ( #1385 )
...
* add custom score for each hotword
* Add more comments
* Fix decode
* fix style
* minor fixes
2023-11-18 18:47:55 +08:00
Fangjun Kuang
666d69b20d
Rename train2.py to avoid confusion ( #1386 )
2023-11-17 18:12:59 +08:00
zr_jin
231bbcd2b6
Update optim.py ( #1366 )
2023-11-03 12:06:29 +08:00
zr_jin
9e5a5d7839
Incorporate some latest changes to optim.py ( #1359 )
...
* init commit
* black formatted
* isort formatted
2023-11-02 16:10:08 +08:00
zr_jin
23913f6afd
Minor refinements for some stale but recently merged PRs ( #1354 )
...
* incorporate https://github.com/k2-fsa/icefall/pull/1269
* incorporate https://github.com/k2-fsa/icefall/pull/1301
* black formatted
* incorporate https://github.com/k2-fsa/icefall/pull/1162
* black formatted
2023-10-31 10:28:20 +08:00
Tiance Wang
c970df512b
New recipe: tiny_transducer_ctc ( #848 )
...
* initial commit
* update readme
* Update README.md
* change bool to str2bool for arg parser
* run validation only at the end of epoch
* black format
* black format
2023-10-30 12:09:39 +08:00
Desh Raj
7d56685734
[recipe] LibriSpeech zipformer_ctc ( #941 )
...
* merge upstream
* initial commit for zipformer_ctc
* remove unwanted changes
* remove changes to other recipe
* fix zipformer softlink
* fix for JIT export
* add missing file
* fix symbolic links
* update results
* Update RESULTS.md
Address comments from @csukuangfj
---------
Co-authored-by: zr_jin <peter.jin.cn@gmail.com>
2023-10-27 13:38:09 +08:00
Zengwei Yao
c0a53271e2
Update Zipformer-large result on LibriSpeech ( #1343 )
...
* update zipformer-large result on librispeech
2023-10-26 17:35:12 +08:00
zr_jin
1814bbb0e7
typo fixed ( #1334 )
2023-10-25 00:03:33 +08:00
zr_jin
f9980aa606
minor fixes ( #1332 )
2023-10-24 08:17:17 +08:00
zr_jin
92ef561ff7
Minor fixes for torch.jit.script support ( #1329 )
2023-10-24 01:10:50 +08:00
Karel Vesely
543b4cc1ca
small enhancements ( #1322 )
...
- add an extra check of 'x' and 'x_lens' at an earlier point in the Transducer model
- specify 'utf' encoding when opening text files for writing (recogs, errs)
2023-10-19 21:53:31 +08:00
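The second point of #1322 (an explicit encoding when writing result files) might look like the sketch below. It is an illustration only: the helper `write_lines`, the file name, and the sample transcripts are hypothetical, and the encoding is assumed to be "utf-8", since the commit message only says 'utf'.

```python
import tempfile
from pathlib import Path


def write_lines(path, lines):
    """Write one line per entry, with an explicit encoding.

    Hypothetical helper (not from the repository). Without `encoding=...`,
    open() falls back to the platform's locale encoding, which can fail on
    non-ASCII transcripts.
    """
    with open(path, "w", encoding="utf-8") as f:
        for line in lines:
            f.write(line + "\n")


with tempfile.TemporaryDirectory() as tmp_dir:
    recogs = Path(tmp_dir) / "recogs-example.txt"  # hypothetical file name
    write_lines(recogs, ["utt-1 你好 世界", "utt-2 hello world"])
    print(recogs.read_text(encoding="utf-8"))
```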
marcoyang1998
52c24df61d
Fix model avg ( #1317 )
...
* fix a bug about the model_avg during finetuning by exchanging the order of loading pre-trained model and initializing avg model
* only match the exact module prefix
2023-10-18 17:36:14 +08:00
Erwan Zerhouni
807816fec0
Fix chunk issue for sherpa ( #1316 )
2023-10-18 16:07:10 +08:00
zr_jin
d2bd0933b1
Compatibility with the latest Lhotse ( #1314 )
2023-10-17 21:22:32 +08:00
zr_jin
162ceaf4b3
fixes for data preparation ( #1307 )
...
Issue: #1306
2023-10-12 17:05:41 +08:00
Wen Ding
2b3c5d799f
Fix padding issues ( #1303 )
2023-10-11 16:58:00 +08:00
Fangjun Kuang
cb874e9905
add export-onnx.py for stateless8 ( #1302 )
...
* add export-onnx.py for stateless8
* use tokens.txt to replace bpe.model
2023-10-11 12:20:12 +08:00
Zengwei Yao
9af144c26b
Zipformer update result ( #1296 )
...
* update Zipformer results
2023-10-09 23:15:22 +08:00
zr_jin
fefffc02f6
Update optim.py ( #1292 )
2023-10-09 17:39:23 +08:00
Fangjun Kuang
109354b6b8
Add CTC HLG decoding for zipformer ( #1287 )
2023-10-02 14:00:06 +08:00
Fangjun Kuang
f14b673408
Add HLG decoding with OpenFst on CPU for aishell conformer_ctc ( #1279 )
2023-10-01 13:46:16 +08:00
Fangjun Kuang
772ee3955b
Support HLG decoding using OpenFst with kaldi decoders ( #1275 )
2023-09-27 14:49:27 +08:00
Fangjun Kuang
2318c3fbd0
Support CTC decoding on CPU using OpenFst and kaldi decoders. ( #1244 )
2023-09-26 16:36:19 +08:00
marcoyang1998
e17f884ace
Fix docs for MVQ ( #1272 )
...
* typo fix
2023-09-25 15:36:40 +08:00
zr_jin
ef5da4824d
formatted the entire LibriSpeech recipe ( #1270 )
...
* formatted the entire librispeech recipe
* minor updates
2023-09-24 17:31:01 +08:00
zr_jin
ef658d691e
fixes for init value of diagnostics.TensorDiagnosticOptions ( #1269 )
...
* fixes for `diagnostics`: replace `2 ** 22` with `512` as the default value of `diagnostics.TensorDiagnosticOptions`; also black formatted some scripts
* fixed formatting issues
2023-09-24 17:06:47 +08:00
Fangjun Kuang
34e40a86b3
Fix exporting decoder model to onnx ( #1264 )
...
* Use torch.jit.script() to export the decoder model
See also https://github.com/k2-fsa/sherpa-onnx/issues/327
2023-09-22 09:57:15 +08:00
Fangjun Kuang
f5dc957d44
Fix CI tests ( #1266 )
2023-09-21 21:16:14 +08:00
l2009312042
45d60ef262
Update conformer.py ( #1200 )
...
* Update conformer.py
* Update zipformer.py
fix bug in get_dynamic_dropout_rate
2023-09-21 19:41:10 +08:00