821 Commits

Author SHA1 Message Date
jinzr
6097d7363d Create convert_transcript_words_to_tokens.py 2023-12-20 11:31:06 +08:00
jinzr
39a02f7c30 added blank penalty 2023-11-17 17:06:23 +08:00
zr_jin
b6bcd4dcf4
Merge branch 'k2-fsa:master' into dev/lm_multi_zh-hans 2023-11-10 17:22:08 +08:00
lishaojie
1b2e99d374
add the pruned_transducer_stateless7_streaming recipe for commonvoice (#1018)
* add the pruned_transducer_stateless7_streaming recipe for commonvoice

* fix the symlinks

* Update RESULTS.md
2023-11-09 22:07:28 +08:00
jinzr
a37408f663 Revert "Update decode.py"
This reverts commit 73e1237c2d5842ab0b0d3b5ab474c948fd8ff019.
2023-11-09 11:57:49 +08:00
jinzr
73e1237c2d Update decode.py 2023-11-09 11:50:39 +08:00
jinzr
16499a5ef6 Update decode.py 2023-11-09 11:37:18 +08:00
zr_jin
fb541ec60c
Merge branch 'k2-fsa:master' into dev/lm_multi_zh-hans 2023-11-09 11:08:28 +08:00
jinzr
b4d91d24ac Update asr_datamodule.py 2023-11-09 11:02:36 +08:00
jinzr
7bd260fb5a Update decode.py 2023-11-09 11:01:21 +08:00
jinzr
852f5a6153 isort formatted 2023-11-09 10:56:48 +08:00
JinZr
de3daf6496 Merge branch 'dev/lm_multi_zh-hans' of https://github.com/JinZr/icefall into dev/lm_multi_zh-hans 2023-11-09 10:53:05 +08:00
JinZr
91da99ff52 updated 2023-11-09 10:51:41 +08:00
jinzr
8d20337d8a Update decode.py 2023-11-09 10:45:22 +08:00
jinzr
4c4c26fbb7 Update decode.py 2023-11-09 10:40:33 +08:00
jinzr
3694e419fb Update prepare_lm_training_data.py 2023-11-08 11:52:01 +08:00
jinzr
c54fdf9ff9 Update prepare_lm_data.sh 2023-11-08 11:42:46 +08:00
jinzr
aead3e0c65 Update sort_lm_training_data.py 2023-11-08 11:42:28 +08:00
jinzr
3f89cb380a minor updates 2023-11-08 11:36:36 +08:00
jinzr
817413f899 minor updates 2023-11-08 10:53:34 +08:00
jinzr
d29efb7345 Update prepare_lm_training_data.py 2023-11-08 10:20:56 +08:00
jinzr
403e2e52ac Update prepare_lm_training_data.py 2023-11-08 10:20:10 +08:00
jinzr
7f53f59776 Update prepare_lm_training_data.py 2023-11-08 10:14:08 +08:00
jinzr
86c3dbec0e Update prepare_lm_training_data.py 2023-11-08 10:07:32 +08:00
jinzr
94f963baf8 Update prepare_lm_training_data.py 2023-11-08 10:05:29 +08:00
jinzr
1a11440014 minor updates 2023-11-08 09:57:57 +08:00
zr_jin
231bbcd2b6
Update optim.py (#1366) 2023-11-03 12:06:29 +08:00
wnywbyt
c3bbb32f9e
Update the parameter 'vocab-size' (#1364)
Co-authored-by: wdq <dongqin.wan@desaysv.com>
2023-11-02 20:45:30 +08:00
zr_jin
9e5a5d7839
Incorporate some latest changes to optim.py (#1359)
* init commit

* black formatted

* isort formatted
2023-11-02 16:10:08 +08:00
zr_jin
23913f6afd
Minor refinements for some stale but recently merged PRs (#1354)
* incorporate https://github.com/k2-fsa/icefall/pull/1269

* incorporate https://github.com/k2-fsa/icefall/pull/1301

* black formatted

* incorporate https://github.com/k2-fsa/icefall/pull/1162

* black formatted
2023-10-31 10:28:20 +08:00
Tiance Wang
c970df512b
New recipe: tiny_transducer_ctc (#848)
* initial commit

* update readme

* Update README.md

* change bool to str2bool for arg parser

* run validation only at the end of epoch

* black format

* black format
2023-10-30 12:09:39 +08:00
Desh Raj
7d56685734
[recipe] LibriSpeech zipformer_ctc (#941)
* merge upstream

* initial commit for zipformer_ctc

* remove unwanted changes

* remove changes to other recipe

* fix zipformer softlink

* fix for JIT export

* add missing file

* fix symbolic links

* update results

* Update RESULTS.md

Address comments from @csukuangfj

---------

Co-authored-by: zr_jin <peter.jin.cn@gmail.com>
2023-10-27 13:38:09 +08:00
zr_jin
ea78b32857
minor fixes (#1345) 2023-10-27 13:35:43 +08:00
Zengwei Yao
c0a53271e2
Update Zipformer-large result on LibriSpeech (#1343)
* update zipformer-large result on librispeech
2023-10-26 17:35:12 +08:00
zr_jin
770c495484
minor fixes in the CTC decoding code (#1338) 2023-10-25 17:14:17 +08:00
zr_jin
dcbc7a63e1
Update train-rnn-lm.sh (#1337) 2023-10-25 12:50:35 +08:00
zr_jin
1814bbb0e7
typo fixed (#1334) 2023-10-25 00:03:33 +08:00
zr_jin
f82bccfd63
Support CTC decoding for multi-zh_hans recipe (#1313) 2023-10-24 19:04:09 +08:00
zr_jin
d76c3fe472
Migrate zipformer model to other Chinese datasets (#1216)
added zipformer recipe for AISHELL-1
2023-10-24 16:24:46 +08:00
zr_jin
f9980aa606
minor fixes (#1332) 2023-10-24 08:17:17 +08:00
zr_jin
92ef561ff7
Minor fixes for torch.jit.script support (#1329) 2023-10-24 01:10:50 +08:00
jinzr
a006382941 Create prepare_lm_data.sh 2023-10-23 13:29:31 +08:00
Yifan Yang
416852e8a1
Add Zipformer recipe for GigaSpeech (#1254)
Co-authored-by: Yifan Yang <yifanyeung@qq.com>
Co-authored-by: yfy62 <yfy62@d3-hpc-sjtu-test-005.cm.cluster>
2023-10-21 15:36:59 +08:00
Rudra
eef47adee9
fix typo (#1324) 2023-10-19 22:54:43 +08:00
Karel Vesely
543b4cc1ca
small enhancements (#1322)
- add extra check of 'x' and 'x_lens' to earlier point in Transducer model
- specify 'utf' encoding when opening text files for writing (recogs, errs)
2023-10-19 21:53:31 +08:00
marcoyang1998
ce372cce33
Update documentation to PromptASR (#1321) 2023-10-19 17:24:31 +08:00
marcoyang1998
52c24df61d
Fix model avg (#1317)
* fix a bug about the model_avg during finetuning by exchanging the order of loading pre-trained model and initializing avg model

* only match the exact module prefix
2023-10-18 17:36:14 +08:00
Erwan Zerhouni
807816fec0
Fix chunk issue for sherpa (#1316) 2023-10-18 16:07:10 +08:00
zr_jin
d2bd0933b1
Compatibility with the latest Lhotse (#1314) 2023-10-17 21:22:32 +08:00
zr_jin
1ef349d120
[WIP] AISHELL-1 pruned transducer stateless7 streaming recipe (#1300)
* `pruned_transducer_stateless7_streaming` for AISHELL-1

* Update train.py

* Update train2.py

* Update decode.py

* Update RESULTS.md
2023-10-16 16:28:16 +08:00