LoganLiu66
f08af2fa22
fix initial states ( #1398 )
...
Co-authored-by: liujiawang02 <liujiawang02@baidu.com>
2023-12-04 22:29:42 +08:00
Wei Kang
11d816d174
Add customized score for hotwords ( #1385 )
...
* add custom score for each hotword
* Add more comments
* Fix decode
* fix style
* minor fixes
2023-11-18 18:47:55 +08:00
Fangjun Kuang
666d69b20d
Rename train2.py to avoid confusion ( #1386 )
2023-11-17 18:12:59 +08:00
zr_jin
231bbcd2b6
Update optim.py ( #1366 )
2023-11-03 12:06:29 +08:00
zr_jin
9e5a5d7839
Incorporate some latest changes to optim.py ( #1359 )
...
* init commit
* black formatted
* isort formatted
2023-11-02 16:10:08 +08:00
zr_jin
23913f6afd
Minor refinements for some stale but recently merged PRs ( #1354 )
...
* incorporate https://github.com/k2-fsa/icefall/pull/1269
* incorporate https://github.com/k2-fsa/icefall/pull/1301
* black formatted
* incorporate https://github.com/k2-fsa/icefall/pull/1162
* black formatted
2023-10-31 10:28:20 +08:00
Tiance Wang
c970df512b
New recipe: tiny_transducer_ctc ( #848 )
...
* initial commit
* update readme
* Update README.md
* change bool to str2bool for arg parser
* run validation only at the end of epoch
* black format
* black format
2023-10-30 12:09:39 +08:00
Desh Raj
7d56685734
[recipe] LibriSpeech zipformer_ctc ( #941 )
...
* merge upstream
* initial commit for zipformer_ctc
* remove unwanted changes
* remove changes to other recipe
* fix zipformer softlink
* fix for JIT export
* add missing file
* fix symbolic links
* update results
* Update RESULTS.md
Address comments from @csukuangfj
---------
Co-authored-by: zr_jin <peter.jin.cn@gmail.com>
2023-10-27 13:38:09 +08:00
Zengwei Yao
c0a53271e2
Update Zipformer-large result on LibriSpeech ( #1343 )
...
* update zipformer-large result on librispeech
2023-10-26 17:35:12 +08:00
zr_jin
1814bbb0e7
typo fixed ( #1334 )
2023-10-25 00:03:33 +08:00
zr_jin
f9980aa606
minor fixes ( #1332 )
2023-10-24 08:17:17 +08:00
zr_jin
92ef561ff7
Minor fixes for torch.jit.script support ( #1329 )
2023-10-24 01:10:50 +08:00
Karel Vesely
543b4cc1ca
small enhancements ( #1322 )
...
- add extra check of 'x' and 'x_lens' to earlier point in Transducer model
- specify 'utf' encoding when opening text files for writing (recogs, errs)
2023-10-19 21:53:31 +08:00
marcoyang1998
52c24df61d
Fix model avg ( #1317 )
...
* fix a model_avg bug during finetuning by swapping the order of loading the pre-trained model and initializing the avg model
* only match the exact module prefix
2023-10-18 17:36:14 +08:00
Erwan Zerhouni
807816fec0
Fix chunk issue for sherpa ( #1316 )
2023-10-18 16:07:10 +08:00
zr_jin
d2bd0933b1
Compatibility with the latest Lhotse ( #1314 )
2023-10-17 21:22:32 +08:00
zr_jin
162ceaf4b3
fixes for data preparation ( #1307 )
...
Issue: #1306
2023-10-12 17:05:41 +08:00
Wen Ding
2b3c5d799f
Fix padding issues ( #1303 )
2023-10-11 16:58:00 +08:00
Fangjun Kuang
cb874e9905
add export-onnx.py for stateless8 ( #1302 )
...
* add export-onnx.py for stateless8
* use tokens.txt to replace bpe.model
2023-10-11 12:20:12 +08:00
Zengwei Yao
9af144c26b
Zipformer update result ( #1296 )
...
* update Zipformer results
2023-10-09 23:15:22 +08:00
zr_jin
fefffc02f6
Update optim.py ( #1292 )
2023-10-09 17:39:23 +08:00
Fangjun Kuang
109354b6b8
Add CTC HLG decoding for zipformer ( #1287 )
2023-10-02 14:00:06 +08:00
Fangjun Kuang
f14b673408
Add HLG decoding with OpenFst on CPU for aishell conformer_ctc ( #1279 )
2023-10-01 13:46:16 +08:00
Fangjun Kuang
772ee3955b
Support HLG decoding using OpenFst with kaldi decoders ( #1275 )
2023-09-27 14:49:27 +08:00
Fangjun Kuang
2318c3fbd0
Support CTC decoding on CPU using OpenFst and kaldi decoders. ( #1244 )
2023-09-26 16:36:19 +08:00
marcoyang1998
e17f884ace
Fix docs for MVQ ( #1272 )
...
* typo fix
2023-09-25 15:36:40 +08:00
zr_jin
ef5da4824d
formatted the entire LibriSpeech recipe ( #1270 )
...
* formatted the entire librispeech recipe
* minor updates
2023-09-24 17:31:01 +08:00
zr_jin
ef658d691e
fixes for init value of diagnostics.TensorDiagnosticOptions ( #1269 )
...
* fixes for `diagnostics`
Replace `2 ** 22` with `512` as the default value of `diagnostics.TensorDiagnosticOptions`
also black formatted some scripts
* fixed formatting issues
2023-09-24 17:06:47 +08:00
Fangjun Kuang
34e40a86b3
Fix exporting decoder model to onnx ( #1264 )
...
* Use torch.jit.script() to export the decoder model
See also https://github.com/k2-fsa/sherpa-onnx/issues/327
2023-09-22 09:57:15 +08:00
Fangjun Kuang
f5dc957d44
Fix CI tests ( #1266 )
2023-09-21 21:16:14 +08:00
l2009312042
45d60ef262
Update conformer.py ( #1200 )
...
* Update conformer.py
* Update zipformer.py
fix bug in get_dynamic_dropout_rate
2023-09-21 19:41:10 +08:00
zr_jin
bbb03f7962
Update decoder.py ( #1262 )
2023-09-20 08:15:54 +08:00
zr_jin
7cc2dae940
Fixes to incorporate with the latest Lhotse release ( #1249 )
2023-09-13 12:39:49 +08:00
zr_jin
0f1bc6f8af
Multi_zh-Hans Recipe ( #1238 )
...
* Init commit for recipes trained on multiple zh datasets.
* fbank extraction for thchs30
* added support for aishell1
* added support for aishell-2
* fixes
* fixes
* fixes
* added support for stcmds and primewords
* fixes
* added support for magicdata
script for fbank computation not done yet
* added script for magicdata fbank computation
* file permission fixed
* updated for the wenetspeech recipe
* updated
* Update preprocess_kespeech.py
* updated
* updated
* updated
* updated
* file permission fixed
* updated paths
* fixes
* added support for kespeech dev/test set fbank computation
* fixes for file permission
* refined support for KeSpeech
* added scripts for BPE model training
* updated
* init commit for the multi_zh-cn zipformer recipe
* disable speed perturbation by default
* updated
* updated
* added necessary files for the zipformer recipe
* removed redundant wenetspeech M and S sets
* updates for multi dataset decoding
* refined
* formatting issues fixed
* updated
* minor fixes
* this commit finalizes the recipe (hopefully)
* fixed formatting issues
* minor fixes
* updated
* using soft links to reduce redundancy
* minor updates
* using soft links to reduce redundancy
* minor updates
* minor updates
* using soft links to reduce redundancy
* minor updates
* Update README.md
* minor updates
* Update egs/multi_zh-hans/ASR/local/compute_fbank_magicdata.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_magicdata.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_stcmds.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_stcmds.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_primewords.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_primewords.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* minor updates
* minor fixes
* fixed a formatting issue
* Update preprocess_kespeech.py
* Update prepare.sh
* Update egs/multi_zh-hans/ASR/local/compute_fbank_kespeech_splits.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/preprocess_kespeech.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* removed redundant files
* symlinks added
* minor updates
* added CI tests for `multi_zh-hans`
* minor fixes
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
---------
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-09-13 11:57:05 +08:00
zr_jin
d50a9ea030
doc str fixes ( #1241 )
2023-09-07 16:34:53 +08:00
Wei Kang
4d7f73ce65
Add context biasing for zipformer recipe ( #1204 )
...
* Add context biasing for zipformer recipe
* support context biasing in modified_beam_search_LODR
* fix context graph
* Minor fixes
2023-08-28 19:37:32 +08:00
Erwan Zerhouni
9a47c08d08
Update padding modified beam search ( #1217 )
2023-08-14 16:10:50 +02:00
zr_jin
a81396b482
Use tokens.txt to replace bpe.model ( #1162 )
2023-08-12 16:53:59 +08:00
Yifan Yang
00256a7669
Fix decode_stream.py ( #1208 )
...
* Fix decode_stream.py
* Update decode_stream.py
2023-08-09 09:40:58 +08:00
marcoyang1998
1ee251c8b3
Decode zipformer with external LMs ( #1193 )
...
* update some documentation
* support decoding with LMs in zipformer recipe
* update RESULTS.md
2023-08-03 15:50:35 +08:00
Fangjun Kuang
1dbbd7759e
Add tests for subsample.py and fix typos ( #1180 )
2023-07-25 14:46:18 +08:00
zr_jin
4ab7d61008
removed batch_name to fix a KeyError with "uttid" ( #1172 )
2023-07-15 12:39:32 +08:00
Yifan Yang
ffe816e2a8
Fix blank skip ci test ( #1167 )
...
* Fix for ci
* Fix frame_reducer
2023-07-06 23:12:41 +08:00
Fangjun Kuang
130ad0319d
Fix CI test for zipformer CTC ( #1165 )
2023-07-05 10:38:29 +08:00
Fangjun Kuang
b8a17944e4
Fix zipformer CI test ( #1164 )
2023-07-05 10:23:35 +08:00
Fangjun Kuang
9009d028a0
Fix ONNX export for the latest non-streaming zipformer. ( #1160 )
2023-07-03 23:56:51 +08:00
Fangjun Kuang
c3e23ec8d2
Fix logaddexp for ONNX export ( #1158 )
2023-07-02 10:30:09 +08:00
MicKot
98d89463f6
zipformer2 logaddexp onnx safe ( #1157 )
2023-06-30 21:16:40 +08:00
Zengwei Yao
ccd8c624dd
support testing onnx exported model on the test sets ( #1150 )
...
* support testing onnx exported model on the test sets
* use token_table instead
2023-06-30 12:05:37 +08:00
Wei Kang
db71b03026
Support int8 quantization in decoder ( #1152 )
2023-06-29 16:48:59 +08:00