1094 Commits

Author SHA1 Message Date
Fangjun Kuang
140e6381ad
Refactor CI tests for librispeech (#1436) 2023-12-27 13:21:14 +08:00
Fangjun Kuang
db52fe2349
Refactor CI test for aishell (#1435) 2023-12-26 20:29:43 +08:00
Fangjun Kuang
835a92eba5
Add doc about how to use the CPU-only docker images (#1432) 2023-12-25 20:23:56 +08:00
Ali Haznedaroğlu
ddd7131317
Update TTS export-onnx.py scripts for handling variable token counts (#1430) 2023-12-25 19:44:07 +08:00
Fangjun Kuang
c855a58cfd
Generate the dependency matrix by code for GitHub Actions (#1431) 2023-12-25 19:41:09 +08:00
Fangjun Kuang
e5bb1ae86c
Use the CPU docker in CI to simplify the test code (#1427) 2023-12-24 13:40:33 +08:00
Fangjun Kuang
79a42148db
Add CI test to cover zipformer/train.py (#1424) 2023-12-23 00:38:36 +08:00
TianHao Zhang
702d4f5914
Update prepare.sh (#1422)
Fix the bug on line 251:
1. Delete the extra blank
2. Correct the misspelling of "new_vocab_size"
2023-12-21 14:42:33 +08:00
zr_jin
10a234709c
bugs fixed (#1416) 2023-12-14 11:26:37 +08:00
Fangjun Kuang
f85f0252a9
Add greedy search for streaming zipformer CTC. (#1415) 2023-12-13 17:34:12 +08:00
zr_jin
d0da509055
Support ONNX export for Streaming CTC Encoder (#1413)
* Create export-onnx-streaming-ctc.py

* doc_str updated

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

---------

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-12-13 10:33:28 +08:00
Fangjun Kuang
9e9fe7954d
Upload gigaspeech zipformer models in CI (#1412) 2023-12-12 18:57:04 +08:00
Fangjun Kuang
20a82c9abf
first commit (#1411) 2023-12-12 18:13:26 +08:00
Fangjun Kuang
b0f70c9d04
Fix torch.jit.script() export for pruned_transducer_stateless2 (#1410) 2023-12-10 11:38:39 +08:00
zr_jin
df56aff31e
minor fixes to the vits onnx exportation scripts (#1408) 2023-12-08 21:11:31 +08:00
Fangjun Kuang
e9ec827de7
Rename zipformer2 to zipformer_for_ncnn_export_only to avoid confusion. (#1407) 2023-12-08 14:29:24 +08:00
zr_jin
bda72f86ff
minor adjustments to the VITS recipes for onnx runtime (#1405) 2023-12-08 06:32:40 +08:00
Yifan Yang
b87ed26c09
Normalize dockerfile (#1400) 2023-12-06 14:33:45 +08:00
zr_jin
735fb9a73d
A TTS recipe VITS on VCTK dataset (#1380)
* init

* isort formatted

* minor updates

* Create shared

* Update prepare_tokens_vctk.py

* Update prepare_tokens_vctk.py

* Update prepare_tokens_vctk.py

* Update prepare.sh

* updated

* Update train.py

* Update train.py

* Update tts_datamodule.py

* Update train.py

* Update train.py

* Update train.py

* Update train.py

* Update train.py

* Update train.py

* fixed formatting issue

* Update infer.py

* removed redundant files

* Create monotonic_align

* removed redundant files

* created symlinks

* Update prepare.sh

* minor adjustments

* Create requirements_tts.txt

* Update requirements_tts.txt

added version constraints

* Update infer.py

* Update infer.py

* Update infer.py

* updated docs

* Update export-onnx.py

* Update export-onnx.py

* Update test_onnx.py

* updated requirements.txt

* Update test_onnx.py

* Update test_onnx.py

* docs updated

* docs fixed

* minor updates
2023-12-06 09:59:19 +08:00
LoganLiu66
f08af2fa22
fix initial states (#1398)
Co-authored-by: liujiawang02 <liujiawang02@baidu.com>
2023-12-04 22:29:42 +08:00
Zengwei Yao
0622dea30d
Add a TTS recipe VITS on LJSpeech dataset (#1372)
* first commit

* replace phonemizer with g2p

* use Conformer as text encoder

* modify training script, clean codes

* rename directory

* convert text to tokens in data preparation stage

* fix tts_datamodule.py

* support onnx export and testing the exported onnx model

* add doc

* add README.md

* fix style
2023-11-29 21:28:38 +08:00
zr_jin
ae67f75e9c
a bilingual recipe similar to the multi-zh_hans (#1265) 2023-11-26 10:04:15 +08:00
Wei Kang
238b45bea8
Libriheavy recipe (zipformer) (#1261)
* initial commit for libriheavy

* Data prepare pipeline

* Fix train.py

* Fix decode.py

* Add results

* minor fixes

* black

* black

* Incorporate PR https://github.com/k2-fsa/icefall/pull/1269

---------

Co-authored-by: zr_jin <peter.jin.cn@gmail.com>
2023-11-23 01:22:57 +08:00
Wei Kang
11d816d174
Add customized score for hotwords (#1385)
* add custom score for each hotword

* Add more comments

* Fix decode

* fix style

* minor fixes
2023-11-18 18:47:55 +08:00
Fangjun Kuang
666d69b20d
Rename train2.py to avoid confusion (#1386) 2023-11-17 18:12:59 +08:00
Karel Vesely
59c943878f
add the voxpopuli recipe (#1374)
* add the `voxpopuli` recipe

- this is the data preparation
- there is no ASR training and no results

* update the PR#1374 (feedback from @csukuangfj)

- fixing .py headers and docstrings
- removing BUT specific parts of `prepare.sh`
- adding assert `num_jobs >= num_workers` to `compute_fbank.py`
- narrowing list of languages
  (let's limit to ASR sets with transcripts for now)
- added links to `README.md`
- extending `text_from_manifest.py`
2023-11-16 14:38:31 +08:00
zr_jin
6d275ddf9f
fixed broken softlinks (#1381)
* removed broken softlinks

* fixed dependencies

* fixed file permission
2023-11-10 14:45:16 +08:00
lishaojie
1b2e99d374
add the pruned_transducer_stateless7_streaming recipe for commonvoice (#1018)
* add the pruned_transducer_stateless7_streaming recipe for commonvoice

* fix the symlinks

* Update RESULTS.md
2023-11-09 22:07:28 +08:00
zr_jin
231bbcd2b6
Update optim.py (#1366) 2023-11-03 12:06:29 +08:00
wnywbyt
c3bbb32f9e
Update the parameter 'vocab-size' (#1364)
Co-authored-by: wdq <dongqin.wan@desaysv.com>
2023-11-02 20:45:30 +08:00
zr_jin
9e5a5d7839
Incorporate some latest changes to optim.py (#1359)
* init commit

* black formatted

* isort formatted
2023-11-02 16:10:08 +08:00
zr_jin
23913f6afd
Minor refinements for some stale but recently merged PRs (#1354)
* incorporate https://github.com/k2-fsa/icefall/pull/1269

* incorporate https://github.com/k2-fsa/icefall/pull/1301

* black formatted

* incorporate https://github.com/k2-fsa/icefall/pull/1162

* black formatted
2023-10-31 10:28:20 +08:00
Tiance Wang
c970df512b
New recipe: tiny_transducer_ctc (#848)
* initial commit

* update readme

* Update README.md

* change bool to str2bool for arg parser

* run validation only at the end of epoch

* black format

* black format
2023-10-30 12:09:39 +08:00
Himanshu Kumar Mahto
161ab90dfb
Enhancing the contributing.md file (#1351) 2023-10-30 09:07:42 +08:00
Desh Raj
7d56685734
[recipe] LibriSpeech zipformer_ctc (#941)
* merge upstream

* initial commit for zipformer_ctc

* remove unwanted changes

* remove changes to other recipe

* fix zipformer softlink

* fix for JIT export

* add missing file

* fix symbolic links

* update results

* Update RESULTS.md

Address comments from @csukuangfj

---------

Co-authored-by: zr_jin <peter.jin.cn@gmail.com>
2023-10-27 13:38:09 +08:00
Shreyas0410
5cebecf2dc
updated broken link in read.me file (#1342) 2023-10-27 13:36:15 +08:00
zr_jin
ea78b32857
minor fixes (#1345) 2023-10-27 13:35:43 +08:00
hairyputtar
800bf4b6a2
fix more typos (#1340)
* fix more typos

* fix typo

* fix typo

* fix typo
2023-10-27 11:46:28 +08:00
Zengwei Yao
c0a53271e2
Update Zipformer-large result on LibriSpeech (#1343)
* update zipformer-large result on librispeech
2023-10-26 17:35:12 +08:00
zr_jin
770c495484
minor fixes in the CTC decoding code (#1338) 2023-10-25 17:14:17 +08:00
zr_jin
dcbc7a63e1
Update train-rnn-lm.sh (#1337) 2023-10-25 12:50:35 +08:00
zr_jin
1814bbb0e7
typo fixed (#1334) 2023-10-25 00:03:33 +08:00
zr_jin
f82bccfd63
Support CTC decoding for multi-zh_hans recipe (#1313) 2023-10-24 19:04:09 +08:00
zr_jin
d76c3fe472
Migrate zipformer model to other Chinese datasets (#1216)
added zipformer recipe for AISHELL-1
2023-10-24 16:24:46 +08:00
hairyputtar
3fb99400cf
fix typos (#1336)
* fix typo

* fix typo

* Update pruned_transducer_stateless.rst
2023-10-24 15:47:25 +08:00
Fangjun Kuang
4b791ced78
Fix CI tests (#1333) 2023-10-24 10:38:56 +08:00
zr_jin
f9980aa606
minor fixes (#1332) 2023-10-24 08:17:17 +08:00
zr_jin
92ef561ff7
Minor fixes for torch.jit.script support (#1329) 2023-10-24 01:10:50 +08:00
Fangjun Kuang
902dc2364a
Update docker for torch 2.1 (#1326) 2023-10-22 23:25:06 +08:00
Yifan Yang
416852e8a1
Add Zipformer recipe for GigaSpeech (#1254)
Co-authored-by: Yifan Yang <yifanyeung@qq.com>
Co-authored-by: yfy62 <yfy62@d3-hpc-sjtu-test-005.cm.cluster>
2023-10-21 15:36:59 +08:00