Yifan Yang
482c24eab0
Merge pull request #2 from zzasdf/k2ssl-util
checkpoint convert script
2024-03-19 17:44:51 +08:00
zzasdf
ac73f60f5f
format
2024-03-19 17:34:06 +08:00
zzasdf
952abee88c
add checkpoint convert script
2024-03-19 17:29:34 +08:00
Yifan Yang
ea0b6311f1
Merge branch 'k2-fsa:master' into k2ssl
2024-03-13 19:52:14 +08:00
Fangjun Kuang
15bd9a841e
add CI for ljspeech ( #1548 )
2024-03-13 17:39:01 +08:00
Fangjun Kuang
d406b41cbd
Doc: Add page for installing piper-phonemize ( #1547 )
2024-03-13 11:01:18 +08:00
zr_jin
c3f6f28116
Zipformer recipe for Cantonese dataset MDCC ( #1537 )
* init commit
* Create README.md
* handle code switching cases
* misc. fixes
* added manifest statistics
* init commit for the zipformer recipe
* added scripts for exporting model
* added RESULTS.md
* added scripts for streaming related stuff
* doc str fixed
2024-03-13 10:01:28 +08:00
yifanyeung
9321f8ab7a
add librilight
2024-03-13 00:14:20 +08:00
Fangjun Kuang
81f518ea7c
Support different tts model types. ( #1541 )
2024-03-12 22:29:21 +08:00
BannerWang
959906e9dc
Correct alimeeting download link ( #1544 )
Co-authored-by: BannerWang <banner.wang@upblocks.io>
2024-03-12 12:44:09 +08:00
jimmy1984xu
e472fa6840
fix CutMix init parameter ( #1543 )
Co-authored-by: jimmyxu <jimmyxu@upblocks.io>
2024-03-11 18:37:26 +08:00
Yifan Yang
660f647886
Merge branch 'k2-fsa:master' into k2ssl
2024-03-10 13:10:36 +08:00
Fangjun Kuang
60986c3ac1
Fix default value for --context-size in icefall. ( #1538 )
2024-03-08 20:47:13 +08:00
zr_jin
ae61bd4090
Minor fixes for the commonvoice recipe ( #1534 )
* init commit
* fix for issue https://github.com/k2-fsa/icefall/issues/1531
* minor fixes
2024-03-08 11:01:11 +08:00
Yuekai Zhang
5df24c1685
Whisper large fine-tuning on wenetspeech, multi-hans-zh ( #1483 )
* add whisper fbank for wenetspeech
* add whisper fbank for other dataset
* add str to bool
* add decode for wenetspeech
* add requirements.txt
* add original model decode with 30s
* test feature extractor speed
* add aishell2 feat
* change compute feature batch
* fix overwrite
* fix executor
* regression
* add kaldifeatwhisper fbank
* fix io issue
* parallel jobs
* use multi machines
* add wenetspeech fine-tune scripts
* add monkey patch codes
* remove useless file
* fix subsampling factor
* fix too long audios
* add remove long short
* fix whisper version to support multi batch beam
* decode all wav files
* remove utterances longer than 30s in test_net
* only test net
* using soft links
* add kespeech whisper feats
* fix index error
* add manifests for whisper
* change to lilcom chunky writer
* add missing option
* decrease cpu usage
* add speed perturb for kespeech
* fix kespeech speed perturb
* add dataset
* load checkpoint from specific path
* add speechio
* add speechio results
---------
Co-authored-by: zr_jin <peter.jin.cn@gmail.com>
2024-03-07 19:04:27 +08:00
zr_jin
cdb3fb5675
add text norm script for pl ( #1532 )
2024-03-07 18:47:29 +08:00
zr_jin
335a9962de
Fixed formatting issue of PR #1528 ( #1530 )
2024-03-06 08:43:45 +08:00
Rezakh20
ff430b465f
Add num_features to train.py for training WSASR ( #1528 )
2024-03-05 16:40:30 +08:00
zr_jin
242002e0bd
Strengthened style constraints ( #1527 )
2024-03-04 23:28:04 +08:00
yifanyeung
bed950dbcb
update comments
2024-03-01 20:14:23 +08:00
Fangjun Kuang
29b195a42e
Update export-onnx.py for vits to support sherpa-onnx. ( #1524 )
2024-03-01 19:53:58 +08:00
zr_jin
58610b1bf6
Provides README.md for TTS recipes ( #1491 )
* Update README.md
2024-02-29 17:31:28 +08:00
Fangjun Kuang
2f102eb989
Add CUDA docker image for torch 2.2.1 ( #1521 )
2024-02-29 11:41:18 +08:00
Xiaoyu Yang
7e2b561bbf
Add recipe for fine-tuning Zipformer with adapter ( #1512 )
2024-02-29 10:57:38 +08:00
Zengwei Yao
d89f4ea149
Use piper_phonemize as text tokenizer in ljspeech recipe ( #1511 )
* use piper_phonemize as text tokenizer in ljspeech recipe
* modify usage of tokenizer in vits/train.py
* update docs
2024-02-29 10:13:22 +08:00
yifanyeung
99044e1c2b
Fix bugs in HubertDataset
2024-02-27 22:13:39 +08:00
Yifan Yang
8515d92f47
Update ssl_datamodule.py
2024-02-27 10:54:54 +08:00
Yifan Yang
bb266b7ef8
Merge branch 'k2-fsa:master' into k2ssl
2024-02-27 10:48:48 +08:00
Fangjun Kuang
291d06056c
Support torch 2.2.1 for cpu docker. ( #1516 )
2024-02-23 14:24:13 +08:00
Yifan Yang
f2f102dae7
Merge branch 'k2-fsa:master' into k2ssl
2024-02-22 17:45:42 +08:00
Xiaoyu Yang
2483b8b4da
Zipformer recipe for SPGISpeech ( #1449 )
2024-02-22 15:53:19 +08:00
Wei Kang
819bb45539
Add pypinyin to requirements ( #1515 )
2024-02-22 15:50:11 +08:00
Wei Kang
aac7df064a
Recipes for open vocabulary keyword spotting ( #1428 )
* English recipe on gigaspeech; Chinese recipe on wenetspeech
2024-02-22 15:31:20 +08:00
Xiaoyu Yang
13daf73468
docs for finetune zipformer ( #1509 )
2024-02-21 18:06:27 +08:00
Wei Kang
c19b414778
Update docker (adding pypinyin) ( #1513 )
Update docker (adding pypinyin)
2024-02-21 08:04:16 +08:00
zr_jin
027302c902
minor fix for param. names ( #1495 )
2024-02-20 14:38:51 +08:00
Karel Vesely
e59fa38e86
docs: minor fixes of LM rescoring texts ( #1498 )
2024-02-20 10:40:15 +08:00
Zengwei Yao
b3e2044068
minor fix of vits/tokenizer.py ( #1504 )
* minor fix of vits/tokenizer.py
2024-02-19 19:33:32 +08:00
zr_jin
db4d66c0e3
Fixed softlink for ljspeech recipe ( #1503 )
2024-02-19 16:13:09 +08:00
Fangjun Kuang
7eb360d0d5
Fix cpu docker images for torch 2.2.0 ( #1502 )
2024-02-18 20:32:40 +08:00
Fangjun Kuang
17688476e5
Provide docker images for torch 2.2.0 ( #1501 )
2024-02-18 14:56:04 +08:00
yifanyeung
911bfacffd
fix for black
2024-02-18 13:24:02 +08:00
yifanyeung
c0a5601c3d
fix for black
2024-02-18 13:15:56 +08:00
yifanyeung
809bdb07f0
fix for black
2024-02-18 12:44:44 +08:00
yifanyeung
b070d04ae8
fix flake8
2024-02-18 12:36:47 +08:00
Fangjun Kuang
06b356a610
Update cpu docker images to support torch 2.2.0 ( #1499 )
2024-02-18 12:05:38 +08:00
yifanyeung
a2bf39a531
Add k2SSL
2024-02-18 11:44:26 +08:00
safarisadegh
d9ae8c02a0
Update README.md ( #1497 )
2024-02-09 15:05:01 +08:00
Wei Kang
711d6bc462
Refactor prepare.sh in librispeech ( #1493 )
* Refactor prepare.sh in librispeech: break it into three parts, prepare.sh (basic, minimal requirement for transducer), prepare_lm.sh (ngram & nnlm stuff), and prepare_mmi.sh (for MMI training).
2024-02-09 10:44:19 +08:00
Tiance Wang
4ed88d9484
Update shared ( #1487 )
There should be one more ../
2024-02-07 10:16:02 +08:00