1017 Commits

Author SHA1 Message Date
jinzr
4c25b502e2 Merge branch 'master' into dev/bilingual 2023-11-23 17:27:08 +08:00
jinzr
de71456f9d Update run-multi-corpora-zipformer.sh 2023-11-23 17:01:50 +08:00
jinzr
2c5b9b5ede added CI test 2023-11-23 17:00:18 +08:00
jinzr
701835e3fc Update RESULTS.md 2023-11-23 16:04:07 +08:00
jinzr
4209f60832 Update prepare.sh 2023-11-23 14:19:44 +08:00
jinzr
27df51521f Update multi_dataset.py 2023-11-23 11:23:36 +08:00
jinzr
a8d51dd0b0 Delete train_bpe_model.py 2023-11-23 11:21:44 +08:00
jinzr
d15605f660 Update prepare_for_bpe_model.py 2023-11-23 11:20:37 +08:00
jinzr
7e798c21c7 Merge branch 'dev/bilingual' of https://github.com/JinZr/icefall into dev/bilingual 2023-11-23 11:18:51 +08:00
jinzr
d6236ac395 Create prepare_char.py 2023-11-23 11:05:06 +08:00
jinzr
0bc98f1421 Create prepare_lang_bbpe.py 2023-11-23 11:05:04 +08:00
jinzr
18b1a872b1 Delete prepare_lang_bbpe.py 2023-11-23 11:04:56 +08:00
zr_jin
24885dc071
Update egs/multi_zh_en/ASR/local/prepare_for_bpe_model.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-11-23 11:03:47 +08:00
jinzr
a3f6a410f3 Delete prepare_char.py 2023-11-23 11:03:16 +08:00
Wei Kang
238b45bea8
Libriheavy recipe (zipformer) (#1261)
* initial commit for libriheavy

* Data prepare pipeline

* Fix train.py

* Fix decode.py

* Add results

* minor fixes

* black

* black

* Incorporate PR https://github.com/k2-fsa/icefall/pull/1269

---------

Co-authored-by: zr_jin <peter.jin.cn@gmail.com>
2023-11-23 01:22:57 +08:00
jinzr
adce588c49 minor updates 2023-11-22 17:41:00 +08:00
JinZr
5074520b88 Merge branch 'dev/bilingual' of https://github.com/jinzr/icefall into dev/bilingual 2023-11-22 17:26:32 +08:00
JinZr
cd20e21552 minor updates on flags 2023-11-22 17:24:21 +08:00
jinzr
428579e3ac minor updates 2023-11-22 17:11:19 +08:00
Wei Kang
11d816d174
Add customized score for hotwords (#1385)
* add custom score for each hotword

* Add more comments

* Fix decode

* fix style

* minor fixes
2023-11-18 18:47:55 +08:00
Fangjun Kuang
666d69b20d
Rename train2.py to avoid confusion (#1386) 2023-11-17 18:12:59 +08:00
jinzr
fe35141e7e Update train.py 2023-11-17 17:10:04 +08:00
zr_jin
9bfaff59e3
Merge branch 'k2-fsa:master' into dev/bilingual 2023-11-17 17:07:22 +08:00
Karel Vesely
59c943878f
add the voxpopuli recipe (#1374)
* add the `voxpopuli` recipe

- this is the data preparation
- there is no ASR training and no results

* update the PR#1374 (feedback from @csukuangfj)

- fixing .py headers and docstrings
- removing BUT specific parts of `prepare.sh`
- adding assert `num_jobs >= num_workers` to `compute_fbank.py`
- narrowing list of languages
  (let's limit to ASR sets with transcripts for now)
- added links to `README.md`
- extending `text_from_manifest.py`
2023-11-16 14:38:31 +08:00
zr_jin
6d275ddf9f
fixed broken softlinks (#1381)
* removed broken softlinks

* fixed dependencies

* fixed file permission
2023-11-10 14:45:16 +08:00
lishaojie
1b2e99d374
add the pruned_transducer_stateless7_streaming recipe for commonvoice (#1018)
* add the pruned_transducer_stateless7_streaming recipe for commonvoice

* fix the symlinks

* Update RESULTS.md
2023-11-09 22:07:28 +08:00
zr_jin
231bbcd2b6
Update optim.py (#1366) 2023-11-03 12:06:29 +08:00
wnywbyt
c3bbb32f9e
Update the parameter 'vocab-size' (#1364)
Co-authored-by: wdq <dongqin.wan@desaysv.com>
2023-11-02 20:45:30 +08:00
zr_jin
9e5a5d7839
Incorporate some latest changes to optim.py (#1359)
* init commit

* black formatted

* isort formatted
2023-11-02 16:10:08 +08:00
zr_jin
23913f6afd
Minor refinements for some stale but recently merged PRs (#1354)
* incorporate https://github.com/k2-fsa/icefall/pull/1269

* incorporate https://github.com/k2-fsa/icefall/pull/1301

* black formatted

* incorporate https://github.com/k2-fsa/icefall/pull/1162

* black formatted
2023-10-31 10:28:20 +08:00
Tiance Wang
c970df512b
New recipe: tiny_transducer_ctc (#848)
* initial commit

* update readme

* Update README.md

* change bool to str2bool for arg parser

* run validation only at the end of epoch

* black format

* black format
2023-10-30 12:09:39 +08:00
Himanshu Kumar Mahto
161ab90dfb
Enhancing the contributing.md file (#1351) 2023-10-30 09:07:42 +08:00
Desh Raj
7d56685734
[recipe] LibriSpeech zipformer_ctc (#941)
* merge upstream

* initial commit for zipformer_ctc

* remove unwanted changes

* remove changes to other recipe

* fix zipformer softlink

* fix for JIT export

* add missing file

* fix symbolic links

* update results

* Update RESULTS.md

Address comments from @csukuangfj

---------

Co-authored-by: zr_jin <peter.jin.cn@gmail.com>
2023-10-27 13:38:09 +08:00
Shreyas0410
5cebecf2dc
updated broken link in read.me file (#1342) 2023-10-27 13:36:15 +08:00
zr_jin
ea78b32857
minor fixes (#1345) 2023-10-27 13:35:43 +08:00
hairyputtar
800bf4b6a2
fix more typos (#1340)
* fix more typos

* fix typo

* fix typo

* fix typo
2023-10-27 11:46:28 +08:00
Zengwei Yao
c0a53271e2
Update Zipformer-large result on LibriSpeech (#1343)
* update zipformer-large result on librispeech
2023-10-26 17:35:12 +08:00
zr_jin
770c495484
minor fixes in the CTC decoding code (#1338) 2023-10-25 17:14:17 +08:00
zr_jin
dcbc7a63e1
Update train-rnn-lm.sh (#1337) 2023-10-25 12:50:35 +08:00
zr_jin
1814bbb0e7
typo fixed (#1334) 2023-10-25 00:03:33 +08:00
zr_jin
f82bccfd63
Support CTC decoding for multi-zh_hans recipe (#1313) 2023-10-24 19:04:09 +08:00
zr_jin
d76c3fe472
Migrate zipformer model to other Chinese datasets (#1216)
added zipformer recipe for AISHELL-1
2023-10-24 16:24:46 +08:00
hairyputtar
3fb99400cf
fix typos (#1336)
* fix typo

* fix typo

* Update pruned_transducer_stateless.rst
2023-10-24 15:47:25 +08:00
Fangjun Kuang
4b791ced78
Fix CI tests (#1333) 2023-10-24 10:38:56 +08:00
zr_jin
f9980aa606
minor fixes (#1332) 2023-10-24 08:17:17 +08:00
zr_jin
92ef561ff7
Minor fixes for torch.jit.script support (#1329) 2023-10-24 01:10:50 +08:00
Fangjun Kuang
902dc2364a
Update docker for torch 2.1 (#1326) 2023-10-22 23:25:06 +08:00
Yifan Yang
416852e8a1
Add Zipformer recipe for GigaSpeech (#1254)
Co-authored-by: Yifan Yang <yifanyeung@qq.com>
Co-authored-by: yfy62 <yfy62@d3-hpc-sjtu-test-005.cm.cluster>
2023-10-21 15:36:59 +08:00
Rudra
eef47adee9
fix typo (#1324) 2023-10-19 22:54:43 +08:00
Daniel Povey
973dc1026d
Make diagnostics.py more error-tolerant and have wider range of supported torch versions (#1234) 2023-10-19 22:54:00 +08:00