Fangjun Kuang
2318c3fbd0
Support CTC decoding on CPU using OpenFst and kaldi decoders. ( #1244 )
2023-09-26 16:36:19 +08:00
zr_jin
1b565dd251
added softlinks to local dir ( #1273 )
2023-09-26 15:41:39 +08:00
marcoyang1998
e17f884ace
Fix docs for MVQ ( #1272 )
...
* typo fix
2023-09-25 15:36:40 +08:00
zr_jin
ef5da4824d
formatted the entire LibriSpeech recipe ( #1270 )
...
* formatted the entire librispeech recipe
* minor updates
2023-09-24 17:31:01 +08:00
zr_jin
ef658d691e
fixes for init value of diagnostics.TensorDiagnosticOptions ( #1269 )
...
* fixes for `diagnostics`
Replace `2 ** 22` with `512` as the default value of `diagnostics.TensorDiagnosticOptions`
also black formatted some scripts
* fixed formatting issues
2023-09-24 17:06:47 +08:00
Fangjun Kuang
34e40a86b3
Fix exporting decoder model to onnx ( #1264 )
...
* Use torch.jit.script() to export the decoder model
See also https://github.com/k2-fsa/sherpa-onnx/issues/327
2023-09-22 09:57:15 +08:00
Fangjun Kuang
f5dc957d44
Fix CI tests ( #1266 )
2023-09-21 21:16:14 +08:00
l2009312042
45d60ef262
Update conformer.py ( #1200 )
...
* Update conformer.py
* Update zipformer.py
fix bug in get_dynamic_dropout_rate
2023-09-21 19:41:10 +08:00
zr_jin
bbb03f7962
Update decoder.py ( #1262 )
2023-09-20 08:15:54 +08:00
Tiance Wang
7e1288af50
fix thchs-30 download command ( #1260 )
2023-09-19 16:46:36 +08:00
zr_jin
565d2c2f5b
Minor fixes to the libricss recipe ( #1256 )
2023-09-15 02:37:53 +08:00
docterstrange
fba1710622
modify tal_csasr recipe ( #1252 )
...
Co-authored-by: zss11 <zss11@d3-hpc-sjtu-test-001.cm.cluster>
2023-09-14 09:58:28 +08:00
zr_jin
7cc2dae940
Fixes to incorporate with the latest Lhotse release ( #1249 )
2023-09-13 12:39:49 +08:00
zr_jin
0f1bc6f8af
Multi_zh-Hans Recipe ( #1238 )
...
* Init commit for recipes trained on multiple zh datasets.
* fbank extraction for thchs30
* added support for aishell1
* added support for aishell-2
* fixes
* fixes
* fixes
* added support for stcmds and primewords
* fixes
* added support for magicdata
script for fbank computation not done yet
* added script for magicdata fbank computation
* file permission fixed
* updated for the wenetspeech recipe
* updated
* Update preprocess_kespeech.py
* updated
* updated
* updated
* updated
* file permission fixed
* updated paths
* fixes
* added support for kespeech dev/test set fbank computation
* fixes for file permission
* refined support for KeSpeech
* added scripts for BPE model training
* updated
* init commit for the multi_zh-cn zipformer recipe
* disable speed perturbation by default
* updated
* updated
* added necessary files for the zipformer recipe
* removed redundant wenetspeech M and S sets
* updates for multi dataset decoding
* refined
* formatting issues fixed
* updated
* minor fixes
* this commit finalizes the recipe (hopefully)
* fixed formatting issues
* minor fixes
* updated
* using soft links to reduce redundancy
* minor updates
* using soft links to reduce redundancy
* minor updates
* minor updates
* using soft links to reduce redundancy
* minor updates
* Update README.md
* minor updates
* Update egs/multi_zh-hans/ASR/local/compute_fbank_magicdata.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_magicdata.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_stcmds.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_stcmds.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_primewords.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_primewords.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* minor updates
* minor fixes
* fixed a formatting issue
* Update preprocess_kespeech.py
* Update prepare.sh
* Update egs/multi_zh-hans/ASR/local/compute_fbank_kespeech_splits.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/preprocess_kespeech.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* removed redundant files
* symlinks added
* minor updates
* added CI tests for `multi_zh-hans`
* minor fixes
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
---------
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-09-13 11:57:05 +08:00
zr_jin
d50a9ea030
doc str fixes ( #1241 )
2023-09-07 16:34:53 +08:00
zr_jin
9ef8145fa3
minor fixes ( #1240 )
2023-09-04 17:56:05 +08:00
Desh Raj
8fcadb68a7
Missing definitions in scaling.py added ( #1232 )
2023-08-31 10:31:05 +08:00
Wei Kang
4d7f73ce65
Add context biasing for zipformer recipe ( #1204 )
...
* Add context biasing for zipformer recipe
* support context biasing in modified_beam_search_LODR
* fix context graph
* Minor fixes
2023-08-28 19:37:32 +08:00
Fangjun Kuang
fc2df07841
Add icefall tutorials for dummies. ( #1220 )
2023-08-16 22:32:41 +08:00
Erwan Zerhouni
9a47c08d08
Update padding modified beam search ( #1217 )
2023-08-14 16:10:50 +02:00
Piotr Żelasko
b0e8a40c89
Speed up yesno training to finish in ~10s on CPU ( #1215 )
2023-08-13 09:50:59 +08:00
Fangjun Kuang
dfccadc6b6
Fix a typo in export_onnx.py for yesno ( #1213 )
2023-08-12 16:59:06 +08:00
zr_jin
a81396b482
Use tokens.txt to replace bpe.model ( #1162 )
2023-08-12 16:53:59 +08:00
Fangjun Kuang
d6b28a11a7
Add export script for the yesno recipe. ( #1212 )
2023-08-11 23:57:00 +08:00
zr_jin
74806b744b
disable speed perturbation by default ( #1176 )
...
* disable speed perturbation by default
* minor fixes
* minor updates
* updated bash scripts to incorporate with the `speed-perturb` arg
* minor fixes
1. changed the naming scheme from `speed-perturb` to `perturb-speed` to align with the librispeech recipe
>> 00256a7669/egs/librispeech/ASR/local/compute_fbank_librispeech.py (L65)
2. changed arg type for `perturb-speed` to str2bool
2023-08-10 20:56:02 +08:00
Yifan Yang
00256a7669
Fix decode_stream.py ( #1208 )
...
* Fix decode_stream.py
* Update decode_stream.py
2023-08-09 09:40:58 +08:00
marcoyang1998
1ee251c8b3
Decode zipformer with external LMs ( #1193 )
...
* update some documentation
* support decoding with LMs in zipformer recipe
* update RESULTS.md
2023-08-03 15:50:35 +08:00
kobenaxie
80d922c158
Update preprocess_commonvoice.py to fix text normalization bug. ( #1181 )
2023-07-26 16:54:42 +08:00
Fangjun Kuang
1dbbd7759e
Add tests for subsample.py and fix typos ( #1180 )
2023-07-25 14:46:18 +08:00
zr_jin
4ab7d61008
removed batch_name to fix a KeyError with "uttid" ( #1172 )
2023-07-15 12:39:32 +08:00
marcoyang1998
5ed6fc0e6d
add sym link ( #1170 )
2023-07-12 15:37:14 +08:00
Desh Raj
41b16d7838
SURT recipe for AMI and ICSI ( #1133 )
...
* merge upstream
* add SURT model and training
* add libricss decoding
* add chunk width randomization
* decode SURT with libricss
* initial commit for zipformer_ctc
* remove unwanted changes
* remove changes to other recipe
* fix zipformer softlink
* fix for JIT export
* add missing file
* fix symbolic links
* update results
* clean commit for SURT recipe
* training libricss surt model
* remove unwanted files
* remove unwanted changes
* remove changes in librispeech
* change some files to symlinks
* remove unwanted changes in utils
* add export script
* add README
* minor fix in README
* add assets for README
* replace some files with symlinks
* remove unused decoding methods
* initial commit for SURT AMI recipe
* fix symlink
* add train + decode scripts
* add missing symlink
* change files to symlink
* change file type
2023-07-08 23:01:51 +08:00
Yifan Yang
ffe816e2a8
Fix blank skip ci test ( #1167 )
...
* Fix for ci
* Fix frame_reducer
2023-07-06 23:12:41 +08:00
Fangjun Kuang
130ad0319d
Fix CI test for zipformer CTC ( #1165 )
2023-07-05 10:38:29 +08:00
Fangjun Kuang
b8a17944e4
Fix zipformer CI test ( #1164 )
2023-07-05 10:23:35 +08:00
Desh Raj
a4402b88e6
SURT multi-talker ASR recipe ( #1126 )
...
* merge upstream
* add SURT model and training
* add libricss decoding
* add chunk width randomization
* decode SURT with libricss
* initial commit for zipformer_ctc
* remove unwanted changes
* remove changes to other recipe
* fix zipformer softlink
* fix for JIT export
* add missing file
* fix symbolic links
* update results
* clean commit for SURT recipe
* training libricss surt model
* remove unwanted files
* remove unwanted changes
* remove changes in librispeech
* change some files to symlinks
* remove unwanted changes in utils
* add export script
* add README
* minor fix in README
* add assets for README
* replace some files with symlinks
* remove unused decoding methods
* fix symlink
* address comments from @csukuangfj
2023-07-04 19:25:58 +08:00
zr_jin
856c0f2a60
fixed default param for an aishell recipe ( #1159 )
2023-07-04 19:12:39 +08:00
Fangjun Kuang
9009d028a0
Fix ONNX export for the latest non-streaming zipformer. ( #1160 )
2023-07-03 23:56:51 +08:00
Fangjun Kuang
c3e23ec8d2
Fix logaddexp for ONNX export ( #1158 )
2023-07-02 10:30:09 +08:00
MicKot
98d89463f6
zipformer2 logaddexp onnx safe ( #1157 )
2023-06-30 21:16:40 +08:00
Zengwei Yao
ccd8c624dd
support testing onnx exported model on the test sets ( #1150 )
...
* support testing onnx exported model on the test sets
* use token_table instead
2023-06-30 12:05:37 +08:00
Desh Raj
c59c89fc13
Minor fix in tedlium results file ( #1153 )
2023-06-29 13:09:01 +02:00
Wei Kang
db71b03026
Support int8 quantization in decoder ( #1152 )
2023-06-29 16:48:59 +08:00
Desh Raj
9c2172c1c4
Zipformer for TedLium ( #1125 )
...
* initial commit for zipformer tedlium
* fix unk decoding
* add pretrained model and logs
* update for new AsrModel
* add option for choosing rnnt type
* add results with modified rnnt
2023-06-28 16:43:49 +08:00
Fangjun Kuang
968ebd236b
Fix ONNX export of the latest streaming zipformer model. ( #1148 )
2023-06-27 14:35:59 +08:00
Wei Kang
219bba1310
zipformer wenetspeech ( #1130 )
...
* copy files
* update train.py
* small fixes
* Add decode.py
* Fix dataloader in decode.py
* add blank penalty
* Add blank-penalty to other decoding method
* Minor fixes
* add zipformer2 recipe
* Minor fixes
* Remove pruned7
* export and test models
* Replace bpe with tokens in export.py and pretrain.py
* Minor fixes
* Minor fixes
* Minor fixes
* Fix export
* Update results
* Fix zipformer-ctc
* Fix ci
* Fix ci
* Fix CI
* Fix CI
---------
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-06-26 09:33:18 +08:00
frankyoujian
4d5b8369ae
fix small typo ( #1144 )
2023-06-21 17:17:19 +08:00
Yifan Yang
d667dc365b
Fix for diagnostic ( #1135 )
...
* CTC loss return tensor
* Update model.py
2023-06-16 15:04:41 +08:00
Yifan Yang
0a465794a8
Fix Zipformer ( #1132 )
...
* Update model.py
* Update train.py
* Update decoder.py
2023-06-15 17:52:14 +08:00
Fangjun Kuang
947f0614c9
Fix running exported model on GPU. ( #1131 )
2023-06-15 12:25:15 +08:00