96 Commits

Author SHA1 Message Date
Yuekai Zhang
4826f0801c remove utterances longer than 30s in test_net 2024-01-31 14:02:39 +08:00
Yuekai Zhang
d8a329eca5 decode all wav files 2024-01-31 14:02:39 +08:00
Yuekai Zhang
341c29e6e2 fix whisper version to support multi batch beam 2024-01-31 14:02:39 +08:00
Yuekai Zhang
c19891ee8e add remove long short 2024-01-31 14:02:39 +08:00
Yuekai Zhang
bb07b65e45 add remove long short 2024-01-31 14:02:39 +08:00
Yuekai Zhang
1600f7db95 fix too long audios 2024-01-31 14:02:39 +08:00
Yuekai Zhang
b76cd65abf fix subsampling factor 2024-01-31 14:02:39 +08:00
Yuekai Zhang
ad796d929d remove useless file 2024-01-31 14:02:39 +08:00
Yuekai Zhang
e49534f2dd add monkey patch codes 2024-01-31 14:02:39 +08:00
Yuekai Zhang
e1a55b945b add wenetspeech fine-tune scripts 2024-01-31 14:02:39 +08:00
Yuekai Zhang
baa7c5fb8d use multi machines 2024-01-31 14:02:39 +08:00
Yuekai Zhang
cf85019290 parallel jobs 2024-01-31 14:02:39 +08:00
Yuekai Zhang
df54121c41 fix io issue 2024-01-31 14:02:39 +08:00
Yuekai Zhang
af29455c3d add kaldifeatwhisper fbank 2024-01-31 14:02:39 +08:00
Yuekai Zhang
08db3051ad regression 2024-01-31 14:02:39 +08:00
Yuekai Zhang
f66b266aa4 fix executor 2024-01-31 14:02:39 +08:00
Yuekai Zhang
e46e9b77ee fix overwrite 2024-01-31 14:02:39 +08:00
Yuekai Zhang
fd77c5758c change compute feature batch 2024-01-31 14:02:39 +08:00
Yuekai Zhang
f4cf9fb2d3 add aishell2 feat 2024-01-31 14:02:39 +08:00
Yuekai Zhang
aa7b17e410 test feature extractor speed 2024-01-31 14:02:39 +08:00
Yuekai Zhang
d1b010463c add original model decode with 30s 2024-01-31 14:02:39 +08:00
Yuekai Zhang
38f5f45c67 add requirements.txt 2024-01-31 14:02:39 +08:00
Yuekai Zhang
72c9d01724 add decode for wenetspeech 2024-01-31 14:02:39 +08:00
Yuekai Zhang
046e071ca3 add str to bool 2024-01-31 14:02:39 +08:00
Yuekai Zhang
315175a362 add whisper fbank for other dataset 2024-01-31 14:02:39 +08:00
Yuekai Zhang
e43c4da91d add whisper fbank for wenetspeech 2024-01-31 14:02:39 +08:00
zr_jin
37b975cac9
fixed a CI test for wenetspeech (#1476)
* Comply with issue #1149

https://github.com/k2-fsa/icefall/issues/1149
2024-01-27 06:41:56 +08:00
Fangjun Kuang
8d39f9508b
Fix torchscript export to use tokens.txt instead of lang_dir (#1475) 2024-01-26 19:18:33 +08:00
zr_jin
9c494a3329
typos fixed (#1472) 2024-01-25 18:41:43 +08:00
Yifan Yang
5dfc3ed7f9
Fix buffer size of DynamicBucketingSampler (#1468)
* Fix buffer size

* Fix for flake8

---------

Co-authored-by: yifanyeung <yifanyeung@yifanyeung.local>
2024-01-21 02:10:42 +08:00
Karel Vesely
716b82cc3a
streaming_decode.py, relax the audio range from [-1,+1] to [-10,+10] (#1448)
- some AudioTransform classes produce audio signals out of range [-1,+1]
   - Resample produced 1.0079
   - The range [-10,+10] was chosen to still be able to reliably
     distinguish from the [-32k,+32k] signal...
- this is related to : https://github.com/lhotse-speech/lhotse/issues/1254
2024-01-05 10:21:27 +08:00
Fangjun Kuang
8136ad775b
Use high_freq -400 in computing fbank features. (#1447)
See also https://github.com/k2-fsa/sherpa-onnx/issues/514
2024-01-04 13:59:32 +08:00
Wei Kang
11d816d174
Add customized score for hotwords (#1385)
* add custom score for each hotword

* Add more comments

* Fix decode

* fix style

* minor fixes
2023-11-18 18:47:55 +08:00
wnywbyt
c3bbb32f9e
Update the parameter 'vocab-size' (#1364)
Co-authored-by: wdq <dongqin.wan@desaysv.com>
2023-11-02 20:45:30 +08:00
zr_jin
1814bbb0e7
typo fixed (#1334) 2023-10-25 00:03:33 +08:00
Rudra
eef47adee9
fix typo (#1324) 2023-10-19 22:54:43 +08:00
marcoyang1998
52c24df61d
Fix model avg (#1317)
* fix a bug about the model_avg during finetuning by exchanging the order of loading pre-trained model and initializing avg model

* only match the exact module prefix
2023-10-18 17:36:14 +08:00
zr_jin
d2bd0933b1
Compatibility with the latest Lhotse (#1314) 2023-10-17 21:22:32 +08:00
zr_jin
855492156a
Update finetune.py (#1304) 2023-10-12 16:48:23 +08:00
zr_jin
ef658d691e
fixes for init value of diagnostics.TensorDiagnosticOptions (#1269)
* fixes for `diagnostics`

Replace `2 ** 22` with `512` as the default value of `diagnostics.TensorDiagnosticOptions`

also black formatted some scripts

* fixed formatting issues
2023-09-24 17:06:47 +08:00
Fangjun Kuang
34e40a86b3
Fix exporting decoder model to onnx (#1264)
* Use torch.jit.script() to export the decoder model

See also https://github.com/k2-fsa/sherpa-onnx/issues/327
2023-09-22 09:57:15 +08:00
Fangjun Kuang
f5dc957d44
Fix CI tests (#1266) 2023-09-21 21:16:14 +08:00
zr_jin
7cc2dae940
Fixes to incorporate with the latest Lhotse release (#1249) 2023-09-13 12:39:49 +08:00
zr_jin
9ef8145fa3
minor fixes (#1240) 2023-09-04 17:56:05 +08:00
zr_jin
a81396b482
Use tokens.txt to replace bpe.model (#1162) 2023-08-12 16:53:59 +08:00
zr_jin
74806b744b
disable speed perturbation by default (#1176)
* disable speed perturbation by default

* minor fixes

* minor updates

* updated bash scripts to incorporate with the `speed-perturb` arg

* minor fixes

1. changed the naming scheme from `speed-perturb` to `perturb-speed` to align with the librispeech recipe

>> 00256a7669/egs/librispeech/ASR/local/compute_fbank_librispeech.py (L65)

2. changed arg type for `perturb-speed` to str2bool
2023-08-10 20:56:02 +08:00
marcoyang1998
5ed6fc0e6d
add sym link (#1170) 2023-07-12 15:37:14 +08:00
Wei Kang
219bba1310
zipformer wenetspeech (#1130)
* copy files

* update train.py

* small fixes

* Add decode.py

* Fix dataloader in decode.py

* add blank penalty

* Add blank-penalty to other decoding method

* Minor fixes

* add zipformer2 recipe

* Minor fixes

* Remove pruned7

* export and test models

* Replace bpe with tokens in export.py and pretrain.py

* Minor fixes

* Minor fixes

* Minor fixes

* Fix export

* Update results

* Fix zipformer-ctc

* Fix ci

* Fix ci

* Fix CI

* Fix CI

---------

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-06-26 09:33:18 +08:00
Wei Kang
ba257efbcd
Add Context biasing (#1038)
* Add context biasing for librispeech

* Add context biasing for wenetspeech

* fix bugs

* Implement Aho-Corasick context graph

* fix some bugs

* Fixes to forward_one_step; add draw to context graph

* add output arc; fix black

* Fix wenetspeech tokenizer

* Minor fixes to the decode.py
2023-06-03 21:28:49 +08:00
Fangjun Kuang
7b0afbdc16
Remove cur_batch_idx (#1102) 2023-05-30 14:49:54 +08:00