zr_jin
c3f6f28116
Zipformer recipe for Cantonese dataset MDCC ( #1537 )
...
* init commit
* Create README.md
* handle code switching cases
* misc. fixes
* added manifest statistics
* init commit for the zipformer recipe
* added scripts for exporting model
* added RESULTS.md
* added scripts for streaming related stuff
* doc str fixed
2024-03-13 10:01:28 +08:00
Fangjun Kuang
81f518ea7c
Support different tts model types. ( #1541 )
2024-03-12 22:29:21 +08:00
BannerWang
959906e9dc
Correct alimeeting download link ( #1544 )
...
Co-authored-by: BannerWang <banner.wang@upblocks.io>
2024-03-12 12:44:09 +08:00
jimmy1984xu
e472fa6840
fix CutMix init parameter ( #1543 )
...
Co-authored-by: jimmyxu <jimmyxu@upblocks.io>
2024-03-11 18:37:26 +08:00
Fangjun Kuang
60986c3ac1
Fix default value for --context-size in icefall. ( #1538 )
2024-03-08 20:47:13 +08:00
zr_jin
ae61bd4090
Minor fixes for the commonvoice recipe ( #1534 )
...
* init commit
* fix for issue https://github.com/k2-fsa/icefall/issues/1531
* minor fixes
2024-03-08 11:01:11 +08:00
Yuekai Zhang
5df24c1685
Whisper large fine-tuning on wenetspeech, multi-hans-zh ( #1483 )
...
* add whisper fbank for wenetspeech
* add whisper fbank for other dataset
* add str to bool
* add decode for wenetspeech
* add requirements.txt
* add original model decode with 30s
* test feature extractor speed
* add aishell2 feat
* change compute feature batch
* fix overwrite
* fix executor
* regression
* add kaldifeatwhisper fbank
* fix io issue
* parallel jobs
* use multi machines
* add wenetspeech fine-tune scripts
* add monkey patch codes
* remove useless file
* fix subsampling factor
* fix too long audios
* add remove long short
* fix whisper version to support multi batch beam
* decode all wav files
* remove utterances longer than 30s in test_net
* only test net
* using soft links
* add kespeech whisper feats
* fix index error
* add manifests for whisper
* change to lilcom chunky writer
* add missing option
* decrease cpu usage
* add speed perturb for kespeech
* fix kespeech speed perturb
* add dataset
* load checkpoint from specific path
* add speechio
* add speechio results
---------
Co-authored-by: zr_jin <peter.jin.cn@gmail.com>
2024-03-07 19:04:27 +08:00
zr_jin
cdb3fb5675
add text norm script for pl ( #1532 )
2024-03-07 18:47:29 +08:00
zr_jin
335a9962de
Fixed formatting issue of PR #1528 ( #1530 )
2024-03-06 08:43:45 +08:00
Rezakh20
ff430b465f
Add num_features to train.py for training WSASR ( #1528 )
2024-03-05 16:40:30 +08:00
zr_jin
242002e0bd
Strengthened style constraints ( #1527 )
2024-03-04 23:28:04 +08:00
Fangjun Kuang
29b195a42e
Update export-onnx.py for vits to support sherpa-onnx. ( #1524 )
2024-03-01 19:53:58 +08:00
zr_jin
58610b1bf6
Provides README.md for TTS recipes ( #1491 )
...
* Update README.md
2024-02-29 17:31:28 +08:00
Xiaoyu Yang
7e2b561bbf
Add recipe for fine-tuning Zipformer with adapter ( #1512 )
2024-02-29 10:57:38 +08:00
Zengwei Yao
d89f4ea149
Use piper_phonemize as text tokenizer in ljspeech recipe ( #1511 )
...
* use piper_phonemize as text tokenizer in ljspeech recipe
* modify usage of tokenizer in vits/train.py
* update docs
2024-02-29 10:13:22 +08:00
Xiaoyu Yang
2483b8b4da
Zipformer recipe for SPGISpeech ( #1449 )
2024-02-22 15:53:19 +08:00
Wei Kang
aac7df064a
Recipes for open vocabulary keyword spotting ( #1428 )
...
* English recipe on gigaspeech; Chinese recipe on wenetspeech
2024-02-22 15:31:20 +08:00
Zengwei Yao
b3e2044068
minor fix of vits/tokenizer.py ( #1504 )
...
* minor fix of vits/tokenizer.py
2024-02-19 19:33:32 +08:00
zr_jin
db4d66c0e3
Fixed softlink for ljspeech recipe ( #1503 )
2024-02-19 16:13:09 +08:00
Wei Kang
711d6bc462
Refactor prepare.sh in librispeech ( #1493 )
...
* Refactor prepare.sh in librispeech, breaking it into three parts: prepare.sh (basic, the minimal requirement for transducer training), prepare_lm.sh (ngram & nnlm stuff), and prepare_mmi.sh (for MMI training).
2024-02-09 10:44:19 +08:00
Tiance Wang
4ed88d9484
Update shared ( #1487 )
...
There should be one more ../
2024-02-07 10:16:02 +08:00
Xiaoyu Yang
777074046d
Fine-tune recipe for Zipformer ( #1484 )
...
1. support fine-tuning Zipformer
2. update the usage; set a very large batch count
2024-02-06 18:25:43 +08:00
zr_jin
a813186f64
minor fix for docstr and default param. ( #1490 )
...
* Update train.py and README.md
2024-02-05 12:47:52 +08:00
Teo Wen Shen
b9e6327adf
Fixing torch.ctc err ( #1485 )
...
* fixing torch.ctc err
* Move targets & lengths to CPU
2024-02-03 06:25:27 +08:00
Henry Li Xinyuan
b07d5472c5
Implement recipe for Fluent Speech Commands dataset ( #1469 )
...
---------
Signed-off-by: Xinyuan Li <xli257@c13.clsp.jhu.edu>
2024-01-31 22:53:36 +08:00
zr_jin
37b975cac9
fixed a CI test for wenetspeech ( #1476 )
...
* Comply with issue #1149
https://github.com/k2-fsa/icefall/issues/1149
2024-01-27 06:41:56 +08:00
Yuekai Zhang
1c30847947
Whisper Fine-tuning Recipe on Aishell1 ( #1466 )
...
* add decode seamlessm4t
* add requirements
* add decoding with avg model
* add token files
* add custom tokenizer
* support deepspeed to finetune large model
* support large-v3
* add model saving
* using monkey patch to replace models
* add manifest dir option
2024-01-27 00:32:30 +08:00
Fangjun Kuang
8d39f9508b
Fix torchscript export to use tokens.txt instead of lang_dir ( #1475 )
2024-01-26 19:18:33 +08:00
Zengwei Yao
c401a2646b
minor fix of zipformer/optim.py ( #1474 )
2024-01-26 15:50:11 +08:00
zr_jin
9c494a3329
typos fixed ( #1472 )
2024-01-25 18:41:43 +08:00
Triplecq
5d94a19026
prepare for 1000h dataset
2024-01-24 11:33:36 -05:00
Triplecq
d864da4d65
validation scripts
2024-01-25 01:25:28 +09:00
Triplecq
f35fa8aa8f
add blank penalty in decoding script
2024-01-23 17:10:10 -05:00
Triplecq
a8e9dc2488
all combinations of epochs and avgs
2024-01-23 21:12:17 +09:00
Yifan Yang
5dfc3ed7f9
Fix buffer size of DynamicBucketingSampler ( #1468 )
...
* Fix buffer size
* Fix for flake8
---------
Co-authored-by: yifanyeung <yifanyeung@yifanyeung.local>
2024-01-21 02:10:42 +08:00
zr_jin
7bdde9174c
A Zipformer recipe with Byte-level BPE for Aishell-1 ( #1464 )
...
* init commit
* Update train.py
* Update decode.py
* Update RESULTS.md
* added `vocab_size`
* removed unused softlinks
* added scripts for testing pretrained models
* set `bpe_model` as required
* re-org the bbpe recipe for aishell
2024-01-16 21:08:35 +08:00
Triplecq
77178c6311
comment out params related to the chunk size
2024-01-14 17:35:20 -05:00
Triplecq
7b6a89749d
customize decoding script
2024-01-14 17:29:22 -05:00
Triplecq
04fa9e3e8c
training script completed
2024-01-15 07:06:14 +09:00
Triplecq
42c152f5cb
decrease learning-rate to solve the error: RuntimeError: grad_scale is too small, exiting: 5.820766091346741e-11
2024-01-14 12:12:15 -05:00
Triplecq
ced8a53cdc
Merge branch 'master' into rs
2024-01-14 23:05:00 +09:00
Triplecq
819db8fcad
Merge branch 'master' of github.com:Triplecq/icefall
2024-01-14 23:00:19 +09:00
Triplecq
dc2d531540
customized recipes for rs
2024-01-14 22:28:53 +09:00
Triplecq
b1de6f266c
customized recipes for reazonspeech
2024-01-14 22:28:32 +09:00
Triplecq
1e6fe2eae1
restore
2024-01-14 08:05:49 -05:00
Triplecq
5e9a171b20
customize training script for rs
2024-01-14 07:45:33 -05:00
Triplecq
8eae6ec7d1
Add pruned_transducer_stateless2 from reazonspeech branch
2024-01-14 05:23:26 -05:00
Triplecq
af87726bf2
init zipformer recipe
2024-01-14 19:13:21 +09:00
zr_jin
5445ea6df6
Use shuffled LibriSpeech cuts instead ( #1450 )
...
* use shuffled LibriSpeech cuts instead
* leave the old code in comments for reference
2024-01-08 15:09:21 +08:00
zr_jin
b9b56eb879
Minor fixes to the VCTK data prep scripts ( #1441 )
...
* Update prepare.sh
2024-01-08 14:28:07 +08:00