root
9dc2a86754
update graph
2024-03-26 20:18:26 +09:00
root
3b36a67f07
update graph
2024-03-26 20:14:43 +09:00
root
1e25c96e42
update graph
2024-03-26 20:10:03 +09:00
root
baf6ebba90
delete graph
2024-03-26 20:09:11 +09:00
root
5e7db1afec
complete validation
2024-03-26 20:07:39 +09:00
root
456241bf61
update graph
2024-03-25 08:40:54 +09:00
root
03e8cfacca
validation test
2024-03-25 08:37:41 +09:00
root
860a6b27fa
complete exp on zipformer-L
2024-03-25 05:36:59 +09:00
Triplecq
5d94a19026
prepare for 1000h dataset
2024-01-24 11:33:36 -05:00
Triplecq
d864da4d65
validation scripts
2024-01-25 01:25:28 +09:00
Triplecq
f35fa8aa8f
add blank penalty in decoding script
2024-01-23 17:10:10 -05:00
Triplecq
a8e9dc2488
all combinations of epochs and avgs
2024-01-23 21:12:17 +09:00
Triplecq
77178c6311
comment out params related to the chunk size
2024-01-14 17:35:20 -05:00
Triplecq
7b6a89749d
customize decoding script
2024-01-14 17:29:22 -05:00
Triplecq
04fa9e3e8c
training script completed
2024-01-15 07:06:14 +09:00
Triplecq
42c152f5cb
decrease learning-rate to solve the error: RuntimeError: grad_scale is too small, exiting: 5.820766091346741e-11
2024-01-14 12:12:15 -05:00
Triplecq
ced8a53cdc
Merge branch 'master' into rs
2024-01-14 23:05:00 +09:00
Triplecq
819db8fcad
Merge branch 'master' of github.com:Triplecq/icefall
2024-01-14 23:00:19 +09:00
Triplecq
dc2d531540
customized recipes for rs
2024-01-14 22:28:53 +09:00
Triplecq
b1de6f266c
customized recipes for reazonspeech
2024-01-14 22:28:32 +09:00
Triplecq
1e6fe2eae1
restore
2024-01-14 08:05:49 -05:00
Triplecq
5e9a171b20
customize training script for rs
2024-01-14 07:45:33 -05:00
Triplecq
8eae6ec7d1
Add pruned_transducer_stateless2 from reazonspeech branch
2024-01-14 05:23:26 -05:00
Triplecq
af87726bf2
init zipformer recipe
2024-01-14 19:13:21 +09:00
zr_jin
5445ea6df6
Use shuffled LibriSpeech cuts instead ( #1450 )
* use shuffled LibriSpeech cuts instead
* leave the old code in comments for reference
2024-01-08 15:09:21 +08:00
zr_jin
b9b56eb879
Minor fixes to the VCTK data prep scripts ( #1441 )
* Update prepare.sh
2024-01-08 14:28:07 +08:00
Karel Vesely
716b82cc3a
streaming_decode.py, relax the audio range from [-1,+1] to [-10,+10] ( #1448 )
- some AudioTransform classes produce audio signals out of range [-1,+1]
- Resample produced 1.0079
- The range [-10,+10] was chosen to still be able to reliably
distinguish from the [-32k,+32k] signal...
- this is related to : https://github.com/lhotse-speech/lhotse/issues/1254
2024-01-05 10:21:27 +08:00
Fangjun Kuang
8136ad775b
Use high_freq -400 in computing fbank features. ( #1447 )
See also https://github.com/k2-fsa/sherpa-onnx/issues/514
2024-01-04 13:59:32 +08:00
zr_jin
f42258caf8
Update compute_fbank_commonvoice_splits.py ( #1437 )
2023-12-30 13:03:26 +08:00
Chen
2436597f7f
Zipformer recipe
2023-12-28 05:37:40 +09:00
Ali Haznedaroğlu
ddd7131317
Update TTS export-onnx.py scripts for handling variable token counts ( #1430 )
2023-12-25 19:44:07 +08:00
Fangjun Kuang
79a42148db
Add CI test to cover zipformer/train.py ( #1424 )
2023-12-23 00:38:36 +08:00
TianHao Zhang
702d4f5914
Update prepare.sh ( #1422 )
fix the bug in line 251:
1. delete the additional blank
2. correct the spelling error of "new_vocab_size"
2023-12-21 14:42:33 +08:00
Chen
abbee8717a
Merge tag 'rs-experiment' of kdm00:/mnt/syno128/volume1/fujimotos/git/icefall
Experimental version for ReazonSpeech
2023-12-21 04:18:18 +09:00
Fujimoto Seiji
c1ce7ca9e3
Add first cut at ReazonSpeech recipe
This recipe is mostly based on egs/csj, but tweaked to the point that
it can be run with the ReazonSpeech corpus.
That being said, there are some big caveats:
* Currently the model quality is not very good. Actually, it is very
bad. I trained a model with 1000h corpus, and it resulted in >80%
CER on JSUT.
* The core issue seems to be that Zipformer is prone to ignoring utterances
as silent segments. It often produces an empty hypothesis even though
the audio actually contains human voice.
* This issue has already been reported upstream and is not fully
resolved as of Dec 2023.
Signed-off-by: Fujimoto Seiji <fujimoto@ceptord.net>
2023-12-18 16:12:11 +09:00
zr_jin
10a234709c
bugs fixed ( #1416 )
2023-12-14 11:26:37 +08:00
Fangjun Kuang
f85f0252a9
Add greedy search for streaming zipformer CTC. ( #1415 )
2023-12-13 17:34:12 +08:00
zr_jin
d0da509055
Support ONNX export for Streaming CTC Encoder ( #1413 )
* Create export-onnx-streaming-ctc.py
* doc_str updated
---------
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-12-13 10:33:28 +08:00
Fangjun Kuang
20a82c9abf
first commit ( #1411 )
2023-12-12 18:13:26 +08:00
Fangjun Kuang
b0f70c9d04
Fix torch.jit.script() export for pruned_transducer_stateless2 ( #1410 )
2023-12-10 11:38:39 +08:00
zr_jin
df56aff31e
minor fixes to the vits onnx exportation scripts ( #1408 )
2023-12-08 21:11:31 +08:00
Fangjun Kuang
e9ec827de7
Rename zipformer2 to zipformer_for_ncnn_export_only to avoid confusion. ( #1407 )
2023-12-08 14:29:24 +08:00
zr_jin
bda72f86ff
minor adjustments to the VITS recipes for onnx runtime ( #1405 )
2023-12-08 06:32:40 +08:00
zr_jin
735fb9a73d
A TTS recipe VITS on VCTK dataset ( #1380 )
* init
* isort formatted
* minor updates
* Create shared
* Update prepare_tokens_vctk.py
* Update prepare_tokens_vctk.py
* Update prepare_tokens_vctk.py
* Update prepare.sh
* updated
* Update train.py
* Update train.py
* Update tts_datamodule.py
* Update train.py
* Update train.py
* Update train.py
* Update train.py
* Update train.py
* Update train.py
* fixed formatting issue
* Update infer.py
* removed redundant files
* Create monotonic_align
* removed redundant files
* created symlinks
* Update prepare.sh
* minor adjustments
* Create requirements_tts.txt
* Update requirements_tts.txt
added version constraints
* Update infer.py
* Update infer.py
* Update infer.py
* updated docs
* Update export-onnx.py
* Update export-onnx.py
* Update test_onnx.py
* updated requirements.txt
* Update test_onnx.py
* Update test_onnx.py
* docs updated
* docs fixed
* minor updates
2023-12-06 09:59:19 +08:00
LoganLiu66
f08af2fa22
fix initial states ( #1398 )
Co-authored-by: liujiawang02 <liujiawang02@baidu.com>
2023-12-04 22:29:42 +08:00
Zengwei Yao
0622dea30d
Add a TTS recipe VITS on LJSpeech dataset ( #1372 )
* first commit
* replace phonemizer with g2p
* use Conformer as text encoder
* modify training script, clean codes
* rename directory
* convert text to tokens in data preparation stage
* fix tts_datamodule.py
* support onnx export and testing the exported onnx model
* add doc
* add README.md
* fix style
2023-11-29 21:28:38 +08:00
zr_jin
ae67f75e9c
a bilingual recipe similar to the multi-zh_hans ( #1265 )
2023-11-26 10:04:15 +08:00
Wei Kang
238b45bea8
Libriheavy recipe (zipformer) ( #1261 )
* initial commit for libriheavy
* Data prepare pipeline
* Fix train.py
* Fix decode.py
* Add results
* minor fixes
* black
* black
* Incorporate PR https://github.com/k2-fsa/icefall/pull/1269
---------
Co-authored-by: zr_jin <peter.jin.cn@gmail.com>
2023-11-23 01:22:57 +08:00
Wei Kang
11d816d174
Add customized score for hotwords ( #1385 )
* add custom score for each hotword
* Add more comments
* Fix decode
* fix style
* minor fixes
2023-11-18 18:47:55 +08:00
Fangjun Kuang
666d69b20d
Rename train2.py to avoid confusion ( #1386 )
2023-11-17 18:12:59 +08:00