Zengwei Yao
c0a53271e2
Update Zipformer-large result on LibriSpeech ( #1343 )
...
* update zipformer-large result on librispeech
2023-10-26 17:35:12 +08:00
zr_jin
1814bbb0e7
typo fixed ( #1334 )
2023-10-25 00:03:33 +08:00
zr_jin
f9980aa606
minor fixes ( #1332 )
2023-10-24 08:17:17 +08:00
zr_jin
92ef561ff7
Minor fixes for torch.jit.script support ( #1329 )
2023-10-24 01:10:50 +08:00
Karel Vesely
543b4cc1ca
small enhancements ( #1322 )
...
- add an extra check of 'x' and 'x_lens' at an earlier point in the Transducer model
- specify 'utf' encoding when opening text files for writing (recogs, errs)
2023-10-19 21:53:31 +08:00
marcoyang1998
52c24df61d
Fix model avg ( #1317 )
...
* fix a model_avg bug during fine-tuning by swapping the order of loading the pre-trained model and initializing the avg model
* only match the exact module prefix
2023-10-18 17:36:14 +08:00
Erwan Zerhouni
807816fec0
Fix chunk issue for sherpa ( #1316 )
2023-10-18 16:07:10 +08:00
zr_jin
d2bd0933b1
Compatibility with the latest Lhotse ( #1314 )
2023-10-17 21:22:32 +08:00
zr_jin
162ceaf4b3
fixes for data preparation ( #1307 )
...
Issue: #1306
2023-10-12 17:05:41 +08:00
Wen Ding
2b3c5d799f
Fix padding issues ( #1303 )
2023-10-11 16:58:00 +08:00
Fangjun Kuang
cb874e9905
add export-onnx.py for stateless8 ( #1302 )
...
* add export-onnx.py for stateless8
* use tokens.txt to replace bpe.model
2023-10-11 12:20:12 +08:00
zr_jin
103d617380
bug fixes ( #1301 )
2023-10-11 11:04:20 +08:00
Zengwei Yao
9af144c26b
Zipformer update result ( #1296 )
...
* update Zipformer results
2023-10-09 23:15:22 +08:00
zr_jin
fefffc02f6
Update optim.py ( #1292 )
2023-10-09 17:39:23 +08:00
Fangjun Kuang
109354b6b8
Add CTC HLG decoding for zipformer ( #1287 )
2023-10-02 14:00:06 +08:00
Fangjun Kuang
f14b673408
Add HLG decoding with OpenFst on CPU for aishell conformer_ctc ( #1279 )
2023-10-01 13:46:16 +08:00
Dongji Gao
3abc290c11
Add scripts and recipe for BTC/OTC ( #1255 )
2023-09-29 07:52:46 +08:00
Fangjun Kuang
772ee3955b
Support HLG decoding using OpenFst with kaldi decoders ( #1275 )
2023-09-27 14:49:27 +08:00
Fangjun Kuang
2318c3fbd0
Support CTC decoding on CPU using OpenFst and kaldi decoders. ( #1244 )
2023-09-26 16:36:19 +08:00
marcoyang1998
e17f884ace
Fix docs for MVQ ( #1272 )
...
* typo fix
2023-09-25 15:36:40 +08:00
zr_jin
ef5da4824d
formatted the entire LibriSpeech recipe ( #1270 )
...
* formatted the entire librispeech recipe
* minor updates
2023-09-24 17:31:01 +08:00
zr_jin
ef658d691e
fixes for init value of diagnostics.TensorDiagnosticOptions ( #1269 )
...
* fixes for `diagnostics`
Replace `2 ** 22` with `512` as the default value of `diagnostics.TensorDiagnosticOptions`
also black formatted some scripts
* fixed formatting issues
2023-09-24 17:06:47 +08:00
Fangjun Kuang
34e40a86b3
Fix exporting decoder model to onnx ( #1264 )
...
* Use torch.jit.script() to export the decoder model
See also https://github.com/k2-fsa/sherpa-onnx/issues/327
2023-09-22 09:57:15 +08:00
Fangjun Kuang
f5dc957d44
Fix CI tests ( #1266 )
2023-09-21 21:16:14 +08:00
l2009312042
45d60ef262
Update conformer.py ( #1200 )
...
* Update conformer.py
* Update zipformer.py
fix bug in get_dynamic_dropout_rate
2023-09-21 19:41:10 +08:00
zr_jin
bbb03f7962
Update decoder.py ( #1262 )
2023-09-20 08:15:54 +08:00
zr_jin
7cc2dae940
Fixes for compatibility with the latest Lhotse release ( #1249 )
2023-09-13 12:39:49 +08:00
zr_jin
0f1bc6f8af
Multi_zh-Hans Recipe ( #1238 )
...
* Init commit for recipes trained on multiple zh datasets.
* fbank extraction for thchs30
* added support for aishell1
* added support for aishell-2
* fixes
* fixes
* fixes
* added support for stcmds and primewords
* fixes
* added support for magicdata
script for fbank computation not done yet
* added script for magicdata fbank computation
* file permission fixed
* updated for the wenetspeech recipe
* updated
* Update preprocess_kespeech.py
* updated
* updated
* updated
* updated
* file permission fixed
* updated paths
* fixes
* added support for kespeech dev/test set fbank computation
* fixes for file permission
* refined support for KeSpeech
* added scripts for BPE model training
* updated
* init commit for the multi_zh-cn zipformer recipe
* disable speed perturbation by default
* updated
* updated
* added necessary files for the zipformer recipe
* removed redundant wenetspeech M and S sets
* updates for multi dataset decoding
* refined
* formatting issues fixed
* updated
* minor fixes
* this commit finalizes the recipe (hopefully)
* fixed formatting issues
* minor fixes
* updated
* using soft links to reduce redundancy
* minor updates
* using soft links to reduce redundancy
* minor updates
* minor updates
* using soft links to reduce redundancy
* minor updates
* Update README.md
* minor updates
* Update egs/multi_zh-hans/ASR/local/compute_fbank_magicdata.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_magicdata.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_stcmds.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_stcmds.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_primewords.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_primewords.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* minor updates
* minor fixes
* fixed a formatting issue
* Update preprocess_kespeech.py
* Update prepare.sh
* Update egs/multi_zh-hans/ASR/local/compute_fbank_kespeech_splits.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/preprocess_kespeech.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* removed redundant files
* symlinks added
* minor updates
* added CI tests for `multi_zh-hans`
* minor fixes
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
---------
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-09-13 11:57:05 +08:00
zr_jin
d50a9ea030
doc str fixes ( #1241 )
2023-09-07 16:34:53 +08:00
Wei Kang
4d7f73ce65
Add context biasing for zipformer recipe ( #1204 )
...
* Add context biasing for zipformer recipe
* support context biasing in modified_beam_search_LODR
* fix context graph
* Minor fixes
2023-08-28 19:37:32 +08:00
Erwan Zerhouni
9a47c08d08
Update padding modified beam search ( #1217 )
2023-08-14 16:10:50 +02:00
zr_jin
a81396b482
Use tokens.txt to replace bpe.model ( #1162 )
2023-08-12 16:53:59 +08:00
Yifan Yang
00256a7669
Fix decode_stream.py ( #1208 )
...
* Fix decode_stream.py
* Update decode_stream.py
2023-08-09 09:40:58 +08:00
marcoyang1998
1ee251c8b3
Decode zipformer with external LMs ( #1193 )
...
* update some documentation
* support decoding with LMs in zipformer recipe
* update RESULTS.md
2023-08-03 15:50:35 +08:00
Fangjun Kuang
1dbbd7759e
Add tests for subsample.py and fix typos ( #1180 )
2023-07-25 14:46:18 +08:00
zr_jin
4ab7d61008
removed batch_name to fix a KeyError with "uttid" ( #1172 )
2023-07-15 12:39:32 +08:00
Yifan Yang
ffe816e2a8
Fix blank skip ci test ( #1167 )
...
* Fix for ci
* Fix frame_reducer
2023-07-06 23:12:41 +08:00
Fangjun Kuang
130ad0319d
Fix CI test for zipformer CTC ( #1165 )
2023-07-05 10:38:29 +08:00
Fangjun Kuang
b8a17944e4
Fix zipformer CI test ( #1164 )
2023-07-05 10:23:35 +08:00
Fangjun Kuang
9009d028a0
Fix ONNX export for the latest non-streaming zipformer. ( #1160 )
2023-07-03 23:56:51 +08:00
Fangjun Kuang
c3e23ec8d2
Fix logaddexp for ONNX export ( #1158 )
2023-07-02 10:30:09 +08:00
MicKot
98d89463f6
zipformer2 logaddexp onnx safe ( #1157 )
2023-06-30 21:16:40 +08:00
Zengwei Yao
ccd8c624dd
support testing onnx exported model on the test sets ( #1150 )
...
* support testing onnx exported model on the test sets
* use token_table instead
2023-06-30 12:05:37 +08:00
Wei Kang
db71b03026
Support int8 quantization in decoder ( #1152 )
2023-06-29 16:48:59 +08:00
Desh Raj
9c2172c1c4
Zipformer for TedLium ( #1125 )
...
* initial commit for zipformer tedlium
* fix unk decoding
* add pretrained model and logs
* update for new AsrModel
* add option for choosing rnnt type
* add results with modified rnnt
2023-06-28 16:43:49 +08:00
Fangjun Kuang
968ebd236b
Fix ONNX export of the latest streaming zipformer model. ( #1148 )
2023-06-27 14:35:59 +08:00
Wei Kang
219bba1310
zipformer wenetspeech ( #1130 )
...
* copy files
* update train.py
* small fixes
* Add decode.py
* Fix dataloader in decode.py
* add blank penalty
* Add blank-penalty to other decoding method
* Minor fixes
* add zipformer2 recipe
* Minor fixes
* Remove pruned7
* export and test models
* Replace bpe with tokens in export.py and pretrain.py
* Minor fixes
* Minor fixes
* Minor fixes
* Fix export
* Update results
* Fix zipformer-ctc
* Fix ci
* Fix ci
* Fix CI
* Fix CI
---------
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-06-26 09:33:18 +08:00
frankyoujian
4d5b8369ae
fix small typo ( #1144 )
2023-06-21 17:17:19 +08:00
Yifan Yang
d667dc365b
Fix for diagnostic ( #1135 )
...
* CTC loss return tensor
* Update model.py
2023-06-16 15:04:41 +08:00
Yifan Yang
0a465794a8
Fix Zipformer ( #1132 )
...
* Update model.py
* Update train.py
* Update decoder.py
2023-06-15 17:52:14 +08:00