zr_jin
ae67f75e9c
a bilingual recipe similar to the multi-zh_hans ( #1265 )
2023-11-26 10:04:15 +08:00
Fangjun Kuang
666d69b20d
Rename train2.py to avoid confusion ( #1386 )
2023-11-17 18:12:59 +08:00
zr_jin
f82bccfd63
Support CTC decoding for multi-zh_hans recipe ( #1313 )
2023-10-24 19:04:09 +08:00
zr_jin
d76c3fe472
Migrate zipformer model to other Chinese datasets ( #1216 )
...
added zipformer recipe for AISHELL-1
2023-10-24 16:24:46 +08:00
Fangjun Kuang
902dc2364a
Update docker for torch 2.1 ( #1326 )
2023-10-22 23:25:06 +08:00
Yifan Yang
416852e8a1
Add Zipformer recipe for GigaSpeech ( #1254 )
...
Co-authored-by: Yifan Yang <yifanyeung@qq.com>
Co-authored-by: yfy62 <yfy62@d3-hpc-sjtu-test-005.cm.cluster>
2023-10-21 15:36:59 +08:00
zr_jin
82199b8fe1
Init commit for swbd ( #1146 )
2023-10-07 11:44:18 +08:00
Fangjun Kuang
109354b6b8
Add CTC HLG decoding for zipformer ( #1287 )
2023-10-02 14:00:06 +08:00
Fangjun Kuang
f14b673408
Add HLG decoding with OpenFst on CPU for aishell conformer_ctc ( #1279 )
2023-10-01 13:46:16 +08:00
Fangjun Kuang
772ee3955b
Support HLG decoding using OpenFst with kaldi decoders ( #1275 )
2023-09-27 14:49:27 +08:00
Fangjun Kuang
2318c3fbd0
Support CTC decoding on CPU using OpenFst and kaldi decoders. ( #1244 )
2023-09-26 16:36:19 +08:00
zr_jin
0f1bc6f8af
Multi_zh-Hans Recipe ( #1238 )
...
* Init commit for recipes trained on multiple zh datasets.
* fbank extraction for thchs30
* added support for aishell1
* added support for aishell-2
* fixes
* fixes
* fixes
* added support for stcmds and primewords
* fixes
* added support for magicdata
script for fbank computation not done yet
* added script for magicdata fbank computation
* file permission fixed
* updated for the wenetspeech recipe
* updated
* Update preprocess_kespeech.py
* updated
* updated
* updated
* updated
* file permission fixed
* updated paths
* fixes
* added support for kespeech dev/test set fbank computation
* fixes for file permission
* refined support for KeSpeech
* added scripts for BPE model training
* updated
* init commit for the multi_zh-cn zipformer recipe
* disable speed perturbation by default
* updated
* updated
* added necessary files for the zipformer recipe
* removed redundant wenetspeech M and S sets
* updates for multi dataset decoding
* refined
* formatting issues fixed
* updated
* minor fixes
* this commit finalizes the recipe (hopefully)
* fixed formatting issues
* minor fixes
* updated
* using soft links to reduce redundancy
* minor updates
* using soft links to reduce redundancy
* minor updates
* minor updates
* using soft links to reduce redundancy
* minor updates
* Update README.md
* minor updates
* Update egs/multi_zh-hans/ASR/local/compute_fbank_magicdata.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_magicdata.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_stcmds.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_stcmds.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_primewords.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_primewords.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* minor updates
* minor fixes
* fixed a formatting issue
* Update preprocess_kespeech.py
* Update prepare.sh
* Update egs/multi_zh-hans/ASR/local/compute_fbank_kespeech_splits.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/preprocess_kespeech.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* removed redundant files
* symlinks added
* minor updates
* added CI tests for `multi_zh-hans`
* minor fixes
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
---------
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-09-13 11:57:05 +08:00
zr_jin
49a4b67288
fixed a CI test issue related to python version ( #1243 )
2023-09-07 19:48:46 +08:00
zr_jin
c912bd65d0
Update run-gigaspeech-pruned-transducer-stateless2-2022-05-12.sh ( #1242 )
2023-09-07 18:48:27 +08:00
zr_jin
a81396b482
Use tokens.txt to replace bpe.model ( #1162 )
2023-08-12 16:53:59 +08:00
Fangjun Kuang
d6b28a11a7
Add export script for the yesno recipe. ( #1212 )
2023-08-11 23:57:00 +08:00
Fangjun Kuang
375520d419
Run the yesno recipe with docker in GitHub actions ( #1191 )
2023-07-28 15:43:08 +08:00
Fangjun Kuang
751bb6ff1a
Add docker image for icefall ( #1189 )
2023-07-28 10:34:40 +08:00
Fangjun Kuang
1dbbd7759e
Add tests for subsample.py and fix typos ( #1180 )
2023-07-25 14:46:18 +08:00
Yifan Yang
ffe816e2a8
Fix blank skip ci test ( #1167 )
...
* Fix for ci
* Fix frame_reducer
2023-07-06 23:12:41 +08:00
Fangjun Kuang
6fd674312c
Fix failed CI tests ( #1166 )
2023-07-05 10:52:34 +08:00
Wei Kang
219bba1310
zipformer wenetspeech ( #1130 )
...
* copy files
* update train.py
* small fixes
* Add decode.py
* Fix dataloader in decode.py
* add blank penalty
* Add blank-penalty to other decoding method
* Minor fixes
* add zipformer2 recipe
* Minor fixes
* Remove pruned7
* export and test models
* Replace bpe with tokens in export.py and pretrain.py
* Minor fixes
* Minor fixes
* Minor fixes
* Fix export
* Update results
* Fix zipformer-ctc
* Fix ci
* Fix ci
* Fix CI
* Fix CI
---------
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-06-26 09:33:18 +08:00
Zengwei Yao
0ad037d076
Add CTC loss option in zipformer recipe ( #1111 )
...
* add CTC loss option in zipformer recipe
* add ctc_decode.py
* support CTC model export, add jit_pretrained_ctc.py, pretrained_ctc.py
* update README.md and RESULTS.md
* add CI test
2023-06-14 14:27:29 +08:00
Yifan Yang
7c4ff66a3d
Fix yesno CI test ( #1078 )
2023-05-22 12:46:43 +08:00
Fangjun Kuang
3883e362ad
Fix yesno CI test ( #1077 )
2023-05-22 12:29:51 +08:00
Zengwei Yao
f18b539fbc
Add the upgraded Zipformer model ( #1058 )
...
* add the zipformer codes, copied from branch from_dan_scaled_adam_exp1119
* support model export with torch.jit.script
* update RESULTS.md
* support exporting streaming model with torch.jit.script
* add results of streaming models, with some minor changes
* update README.md
* add CI test
* update k2 version in requirements-ci.txt
* update pyproject.toml
2023-05-19 16:47:59 +08:00
marcoyang1998
34d1b07c3d
Modified beam search with RNNLM rescoring ( #1002 )
...
* add RNNLM rescore
* add shallow fusion and lm rescore for streaming zipformer
* minor fix
* update RESULTS.md
* fix yesno workflow, change from ubuntu-18.04 to ubuntu-latest
2023-04-17 16:43:00 +08:00
Yifan Yang
a48812ddb3
Ban the test_rnn.py in ci-test ( #949 )
2023-03-15 22:02:20 +08:00
Yifan Yang
28af269e5e
Fix for workflow ( #934 )
2023-03-09 17:38:15 +08:00
Fangjun Kuang
c01175679e
Add CI test for exporting csj pretrained zipformer to ncnn ( #913 )
2023-02-16 21:09:05 +08:00
Fangjun Kuang
c5e687ddf5
Export streaming zipformer to ncnn ( #906 )
2023-02-13 23:41:43 +08:00
Fangjun Kuang
2b995639b7
Add ONNX support for Zipformer and ConvEmformer ( #884 )
2023-02-09 00:02:38 +08:00
Fangjun Kuang
7ae03f6c88
Add onnx export support for pruned_transducer_stateless5 ( #883 )
2023-02-07 17:47:08 +08:00
Fangjun Kuang
8d3810e289
Simplify ONNX export ( #881 )
...
* Simplify ONNX export
* Fix ONNX CI tests
2023-02-07 15:01:59 +08:00
Fangjun Kuang
52f3a747be
Refactor onnx export for streaming zipformer ( #879 )
2023-02-07 12:12:26 +08:00
Yuekai Zhang
bf5f0342a2
Add streaming onnx export for zipformer ( #831 )
...
* add streaming onnx export for zipformer
* update triton support
* add comments
* add ci test
* add onnxmltools for fp16 onnx export
2023-02-06 10:37:07 +08:00
Yunusemre
0f26edfde9
Add Zipformer Onnx Support ( #778 )
...
* add export script
* add zipformer onnx pretrained script
* add onnx zipformer test
* fix style
* add zipformer onnx to workflow
* replace is_in_onnx_export with is_tracing
* add github.event.label.name == 'onnx'
* add is_tracing to necessary conditions
* fix pooling_mask
* add onnx_check
* add onnx_check to scripts
* add is_tracing to scaling.py
2023-01-03 16:59:44 +08:00
Zengwei Yao
d167aad4ab
Add streaming zipformer ( #787 )
...
* add streaming zipformer codes
* add test_model.py
* add export.py, pretrained.py, jit_pretrained.py
* add cached_len for pooling module
* add jit_trace_export.py and jit_trace_pretrained.py
* fix bug in jit.trace
* update RESULTS.md
* add CI test
* minor fix in pruned_transducer_stateless7/zipformer.py
* update README.md
2022-12-30 10:52:18 +08:00
marcoyang1998
1f0408b103
Support Transformer LM ( #750 )
...
* support transformer LM
* show number of parameters during training
* update docstring
* testing files for ppl calculation
* add lm wrapper for rnn and transformer LM
* apply lm wrapper in lm shallow fusion
* small updates
* update decode.py to support LM fusion and LODR
* add export.py
* update CI and workflow
* update decoding results
* fix CI
* remove transformer LM from CI test
2022-12-29 10:53:36 +08:00
Yifan Yang
070c77e724
Add Blankskip to Zipformer+CTC ( #730 )
...
* init files
* add ctc as auxiliary loss and ctc_decode.py
* tuning the scalar of HLG score for 1best, nbest and nbest-oracle
* rename to pruned_transducer_stateless7_ctc
* fix doc
* fix bug, recover the hlg scores
* modify ctc_decode.py, move out the hlg scale
* fix hlg_scale
* add export.py and pretrained.py, and so on
* upload files, update README.md and RESULTS.md
* add CI test
* update .gitignore
* create symlinks
* Add Blank Skip to Zipformer+CTC
* Add warmup to blank skip
* Add warmup to blank skip
* Add __init__.py
* Add parameters_names to Adam
* Add warmup to blank skip
* Modify frame_reducer
* Modify frame_reducer
* Add Blank Skip to decode.
* Add ctc_decode.py
* Add blank skip to Zipformer+CTC
* process conflict
* process conflict
* modify ctc_guild_decode_bk.py
* modify Lconv
* produce the conflict
* Add export.py
* finish export
* fix for running black
* Add ci test
* Add ci-test
* chmod
* chmod
* fix bug for ci-test
* fix bug for ci-test
* fix bug for ci-test
* rename the dirname
* rename the dirname
* change dirname
* change dirname
* fix notes
* add pretrained.py
* add pretrained.py
* add pretrained.py
* add pretrained.py
* add pretrained.py
* add pretrained.py
* fix
* fix
* fix
* finished
* add the Copyright info and notes
Co-authored-by: Zengwei Yao <yaozengwei@outlook.com>
Co-authored-by: yifanyang <yifanyeung@yifanyangs-MacBook-Pro.local>
2022-12-21 17:41:31 +08:00
Zengwei Yao
0470bbae66
minor fix for zipformer recipe ( #758 )
...
* minor fix
* add CI test
2022-12-13 15:47:30 +08:00
Zengwei Yao
b25c234c51
Add Zipformer-MMI ( #746 )
...
* Minor fix to conformer-mmi
* Minor fixes
* Fix decode.py
* add training files
* train with ctc warmup
* add pruned_transducer_stateless7_mmi
* add zipformer_mmi/mmi_decode.py, using HP as decoding graph
* add mmi_decode.py
* remove pruned_transducer_stateless7_mmi
* rename zipformer_mmi/train_with_ctc.py as zipformer_mmi/train.py
* remove unused method
* rename mmi_decode.py
* add export.py pretrained.py jit_pretrained.py ...
* add RESULTS.md
* add CI test
* add docs
* add README.md
Co-authored-by: pkufool <wkang.pku@gmail.com>
2022-12-11 21:30:39 +08:00
Fangjun Kuang
f13cf61b05
Convert conv-emformer to ncnn ( #717 )
...
* Export conv-emformer via torch.jit.trace()
2022-12-06 16:34:27 +08:00
Zengwei Yao
8eb4b9d96d
Combining rnnt loss and k2-ctc loss for Dan's Zipformer ( #683 )
...
* init files
* add ctc as auxiliary loss and ctc_decode.py
* tuning the scalar of HLG score for 1best, nbest and nbest-oracle
* rename to pruned_transducer_stateless7_ctc
* fix doc
* fix bug, recover the hlg scores
* modify ctc_decode.py, move out the hlg scale
* fix hlg_scale
* add export.py and pretrained.py, and so on
* upload files, update README.md and RESULTS.md
* add CI test
2022-12-03 19:01:10 +08:00
Fangjun Kuang
6533f359c9
Fix CI ( #726 )
...
* Fix CI
* Disable shuffle for yesno.
See https://github.com/k2-fsa/icefall/issues/197
2022-12-02 10:53:06 +08:00
Fangjun Kuang
2bca7032af
Update RNNLM training scripts ( #720 )
...
* Update RNNLM training scripts
* Fix a typo
* Fix CI
2022-12-01 15:57:43 +08:00
marcoyang1998
4b5bc480e8
Add low-order density ratio in RNNLM shallow fusion ( #678 )
...
* Support LODR in RNNLM shallow fusion
* fix style
* fix code style
* update workflow and CI
* update results
* propagate changes to stateless3
* add decoding results for stateless3+giga
* fix CI
2022-11-30 17:26:05 +08:00
Zengwei Yao
ece728d895
Apply delay penalty on k2 ctc loss ( #669 )
...
* add init files
* fix bug, apply delay penalty
* fix decoding code and getting timestamps
* add option applying delay penalty on ctc log-prob
* fix bug of streaming decoding
* minor change for bpe-based case
* add test_model.py
* add README.md
* add CI
2022-11-28 22:34:02 +08:00
Desh Raj
107df3b115
apply black on all files
2022-11-17 09:42:17 -05:00
Fangjun Kuang
60317120ca
Revert "Apply new Black style changes"
2022-11-17 20:19:32 +08:00