zr_jin
4d047dc8b8
Merge c2cb70fc22ffd0a9cb8cbe107846ef3441a7d39c into d9ae8c02a0abdeddc5a4cf9fad72293eda134de3
2024-02-10 04:49:39 -07:00
Yifan Yang
5dfc3ed7f9
Fix buffer size of DynamicBucketingSampler ( #1468 )
* Fix buffer size
* Fix for flake8
---------
Co-authored-by: yifanyeung <yifanyeung@yifanyeung.local>
2024-01-21 02:10:42 +08:00
Fangjun Kuang
8136ad775b
Use high_freq -400 in computing fbank features. ( #1447 )
See also https://github.com/k2-fsa/sherpa-onnx/issues/514
2024-01-04 13:59:32 +08:00
jinzr
c2cb70fc22
Create generate_unique_lexicon.py
2023-12-20 18:58:40 +08:00
jinzr
2a1877486e
Create convert_transcript_words_to_tokens.py
2023-12-20 16:51:45 +08:00
jinzr
ecfbd090af
Delete convert_transcript_words_to_tokens.py
2023-12-20 15:05:57 +08:00
jinzr
6097d7363d
Create convert_transcript_words_to_tokens.py
2023-12-20 11:31:06 +08:00
Fangjun Kuang
f85f0252a9
Add greedy search for streaming zipformer CTC. ( #1415 )
2023-12-13 17:34:12 +08:00
jinzr
39a02f7c30
added blank penalty
2023-11-17 17:06:23 +08:00
jinzr
a37408f663
Revert "Update decode.py"
This reverts commit 73e1237c2d5842ab0b0d3b5ab474c948fd8ff019.
2023-11-09 11:57:49 +08:00
jinzr
73e1237c2d
Update decode.py
2023-11-09 11:50:39 +08:00
jinzr
16499a5ef6
Update decode.py
2023-11-09 11:37:18 +08:00
zr_jin
fb541ec60c
Merge branch 'k2-fsa:master' into dev/lm_multi_zh-hans
2023-11-09 11:08:28 +08:00
jinzr
b4d91d24ac
Update asr_datamodule.py
2023-11-09 11:02:36 +08:00
jinzr
7bd260fb5a
Update decode.py
2023-11-09 11:01:21 +08:00
jinzr
852f5a6153
isort formatted
2023-11-09 10:56:48 +08:00
JinZr
de3daf6496
Merge branch 'dev/lm_multi_zh-hans' of https://github.com/JinZr/icefall into dev/lm_multi_zh-hans
2023-11-09 10:53:05 +08:00
JinZr
91da99ff52
updated
2023-11-09 10:51:41 +08:00
jinzr
8d20337d8a
Update decode.py
2023-11-09 10:45:22 +08:00
jinzr
4c4c26fbb7
Update decode.py
2023-11-09 10:40:33 +08:00
jinzr
3694e419fb
Update prepare_lm_training_data.py
2023-11-08 11:52:01 +08:00
jinzr
c54fdf9ff9
Update prepare_lm_data.sh
2023-11-08 11:42:46 +08:00
jinzr
3f89cb380a
minor updates
2023-11-08 11:36:36 +08:00
jinzr
817413f899
minor updates
2023-11-08 10:53:34 +08:00
jinzr
d29efb7345
Update prepare_lm_training_data.py
2023-11-08 10:20:56 +08:00
jinzr
403e2e52ac
Update prepare_lm_training_data.py
2023-11-08 10:20:10 +08:00
jinzr
7f53f59776
Update prepare_lm_training_data.py
2023-11-08 10:14:08 +08:00
jinzr
86c3dbec0e
Update prepare_lm_training_data.py
2023-11-08 10:07:32 +08:00
jinzr
94f963baf8
Update prepare_lm_training_data.py
2023-11-08 10:05:29 +08:00
jinzr
1a11440014
minor updates
2023-11-08 09:57:57 +08:00
zr_jin
770c495484
minor fixes in the CTC decoding code ( #1338 )
2023-10-25 17:14:17 +08:00
zr_jin
f82bccfd63
Support CTC decoding for multi-zh_hans recipe ( #1313 )
2023-10-24 19:04:09 +08:00
jinzr
a006382941
Create prepare_lm_data.sh
2023-10-23 13:29:31 +08:00
zr_jin
d2bd0933b1
Compatibility with the latest Lhotse ( #1314 )
2023-10-17 21:22:32 +08:00
zr_jin
ef658d691e
fixes for init value of diagnostics.TensorDiagnosticOptions ( #1269 )
* fixes for `diagnostics`
Replace `2 ** 22` with `512` as the default value of `diagnostics.TensorDiagnosticOptions`
also black formatted some scripts
* fixed formatting issues
2023-09-24 17:06:47 +08:00
Tiance Wang
7e1288af50
fix thchs-30 download command ( #1260 )
2023-09-19 16:46:36 +08:00
zr_jin
7cc2dae940
Fixes to incorporate with the latest Lhotse release ( #1249 )
2023-09-13 12:39:49 +08:00
zr_jin
0f1bc6f8af
Multi_zh-Hans Recipe ( #1238 )
* Init commit for recipes trained on multiple zh datasets.
* fbank extraction for thchs30
* added support for aishell1
* added support for aishell-2
* fixes
* fixes
* fixes
* added support for stcmds and primewords
* fixes
* added support for magicdata
script for fbank computation not done yet
* added script for magicdata fbank computation
* file permission fixed
* updated for the wenetspeech recipe
* updated
* Update preprocess_kespeech.py
* updated
* updated
* updated
* updated
* file permission fixed
* updated paths
* fixes
* added support for kespeech dev/test set fbank computation
* fixes for file permission
* refined support for KeSpeech
* added scripts for BPE model training
* updated
* init commit for the multi_zh-cn zipformer recipe
* disable speed perturbation by default
* updated
* updated
* added necessary files for the zipformer recipe
* removed redundant wenetspeech M and S sets
* updates for multi dataset decoding
* refined
* formatting issues fixed
* updated
* minor fixes
* this commit finalize the recipe (hopefully)
* fixed formatting issues
* minor fixes
* updated
* using soft links to reduce redundancy
* minor updates
* using soft links to reduce redundancy
* minor updates
* minor updates
* using soft links to reduce redundancy
* minor updates
* Update README.md
* minor updates
* Update egs/multi_zh-hans/ASR/local/compute_fbank_magicdata.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_magicdata.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_stcmds.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_stcmds.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_primewords.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/compute_fbank_primewords.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* minor updates
* minor fixes
* fixed a formatting issue
* Update preprocess_kespeech.py
* Update prepare.sh
* Update egs/multi_zh-hans/ASR/local/compute_fbank_kespeech_splits.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/multi_zh-hans/ASR/local/preprocess_kespeech.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* removed redundant files
* symlinks added
* minor updates
* added CI tests for `multi_zh-hans`
* minor fixes
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
* Update run-multi-zh_hans-zipformer.sh
---------
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-09-13 11:57:05 +08:00