1029 Commits

Author SHA1 Message Date
root
baf6ebba90 delete graph 2024-03-26 20:09:11 +09:00
root
5e7db1afec complete validation 2024-03-26 20:07:39 +09:00
root
456241bf61 update graph 2024-03-25 08:40:54 +09:00
root
03e8cfacca validation test 2024-03-25 08:37:41 +09:00
root
860a6b27fa complete exp on zipformer-L 2024-03-25 05:36:59 +09:00
Triplecq
5d94a19026 prepare for 1000h dataset 2024-01-24 11:33:36 -05:00
Triplecq
d864da4d65 validation scripts 2024-01-25 01:25:28 +09:00
Triplecq
f35fa8aa8f add blank penalty in decoding script 2024-01-23 17:10:10 -05:00
Triplecq
a8e9dc2488 all combinations of epochs and avgs 2024-01-23 21:12:17 +09:00
Triplecq
77178c6311 comment out params related to the chunk size 2024-01-14 17:35:20 -05:00
Triplecq
7b6a89749d customize decoding script 2024-01-14 17:29:22 -05:00
Triplecq
04fa9e3e8c training script completed 2024-01-15 07:06:14 +09:00
Triplecq
42c152f5cb decrease learning-rate to solve the error: RuntimeError: grad_scale is too small, exiting: 5.820766091346741e-11 2024-01-14 12:12:15 -05:00
Triplecq
ced8a53cdc Merge branch 'master' into rs 2024-01-14 23:05:00 +09:00
Triplecq
819db8fcad Merge branch 'master' of github.com:Triplecq/icefall 2024-01-14 23:00:19 +09:00
Triplecq
dc2d531540 customized recipes for rs 2024-01-14 22:28:53 +09:00
Triplecq
b1de6f266c customized recipes for reazonspeech 2024-01-14 22:28:32 +09:00
Triplecq
1e6fe2eae1 restore 2024-01-14 08:05:49 -05:00
Triplecq
5e9a171b20 customize training script for rs 2024-01-14 07:45:33 -05:00
Triplecq
8eae6ec7d1 Add pruned_transducer_stateless2 from reazonspeech branch 2024-01-14 05:23:26 -05:00
Triplecq
af87726bf2 init zipformer recipe 2024-01-14 19:13:21 +09:00
Fangjun Kuang
398401ed27
Update kaldifeat installation doc (#1460) 2024-01-14 14:38:41 +08:00
Xiaoyu Yang
e2fcb42f5f
fix typo (#1455) 2024-01-09 15:41:37 +08:00
zr_jin
5445ea6df6
Use shuffled LibriSpeech cuts instead (#1450)
* use shuffled LibriSpeech cuts instead

* leave the old code in comments for reference
2024-01-08 15:09:21 +08:00
zr_jin
b9b56eb879
Minor fixes to the VCTK data prep scripts (#1441)
* Update prepare.sh
2024-01-08 14:28:07 +08:00
Karel Vesely
716b82cc3a
streaming_decode.py, relax the audio range from [-1,+1] to [-10,+10] (#1448)
- some AudioTransform classes produce audio signals out of range [-1,+1]
   - Resample produced 1.0079
   - The range [-10,+10] was chosen to still be able to reliably
     distinguish from the [-32k,+32k] signal...
- this is related to : https://github.com/lhotse-speech/lhotse/issues/1254
2024-01-05 10:21:27 +08:00
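The range check described in the commit above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in streaming_decode.py: the helper name `check_audio_range` and its `limit` parameter are hypothetical, and a plain Python list stands in for the audio tensor.

```python
def check_audio_range(samples, limit=10.0):
    """Return the peak absolute amplitude, or raise if it exceeds `limit`.

    Hypothetical helper: float audio normalized to roughly [-1, +1] passes
    even with a small overshoot, while raw 16-bit integer samples
    (magnitudes up to 32768) are rejected.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak > limit:
        raise ValueError(
            f"peak amplitude {peak} exceeds {limit}; "
            "the audio does not look normalized to [-1, +1]"
        )
    return peak


# A resampled signal that slightly overshoots 1.0 is still accepted.
check_audio_range([0.25, -0.5, 1.0079])
```

A wide tolerance such as 10 accepts small overshoots from resampling while still catching audio that was never scaled down from the 16-bit integer range.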
Fangjun Kuang
8136ad775b
Use high_freq -400 in computing fbank features. (#1447)
See also https://github.com/k2-fsa/sherpa-onnx/issues/514
2024-01-04 13:59:32 +08:00
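In Kaldi-style fbank options, a `high_freq` value that is zero or negative is interpreted as an offset from the Nyquist frequency. The arithmetic can be sketched as below, assuming 16 kHz audio; the sampling rate is an assumption, and `effective_high_freq` is a hypothetical helper rather than a library function.

```python
def effective_high_freq(sample_rate, high_freq):
    """Resolve a Kaldi-style high_freq setting to an absolute cutoff in Hz.

    Values <= 0 are treated as an offset from the Nyquist frequency;
    positive values are used as given.
    """
    nyquist = sample_rate / 2
    if high_freq <= 0:
        return nyquist + high_freq
    return high_freq


# With 16 kHz audio, high_freq = -400 places the upper mel edge at 7600 Hz.
print(effective_high_freq(16000, -400))  # 7600.0
```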
zr_jin
f42258caf8
Update compute_fbank_commonvoice_splits.py (#1437) 2023-12-30 13:03:26 +08:00
Chen
2436597f7f Zipformer recipe 2023-12-28 05:37:40 +09:00
Fangjun Kuang
140e6381ad
Refactor CI tests for librispeech (#1436) 2023-12-27 13:21:14 +08:00
Fangjun Kuang
db52fe2349
Refactor CI test for aishell (#1435) 2023-12-26 20:29:43 +08:00
Fangjun Kuang
835a92eba5
Add doc about how to use the CPU-only docker images (#1432) 2023-12-25 20:23:56 +08:00
Ali Haznedaroğlu
ddd7131317
Update TTS export-onnx.py scripts for handling variable token counts (#1430) 2023-12-25 19:44:07 +08:00
Fangjun Kuang
c855a58cfd
Generate the dependency matrix by code for GitHub Actions (#1431) 2023-12-25 19:41:09 +08:00
Fangjun Kuang
e5bb1ae86c
Use the CPU docker in CI to simplify the test code (#1427) 2023-12-24 13:40:33 +08:00
Fangjun Kuang
79a42148db
Add CI test to cover zipformer/train.py (#1424) 2023-12-23 00:38:36 +08:00
TianHao Zhang
702d4f5914
Update prepare.sh (#1422)
Fix the bug on line 251:
1. Remove the extra blank
2. Correct the spelling of "new_vocab_size"
2023-12-21 14:42:33 +08:00
Chen
abbee8717a Merge tag 'rs-experiment' of kdm00:/mnt/syno128/volume1/fujimotos/git/icefall
Experimental version for ReazonSpeech
2023-12-21 04:18:18 +09:00
Triplecq
a82e0019ef
Merge branch 'k2-fsa:master' into master 2023-12-20 13:19:24 -05:00
Fujimoto Seiji
c1ce7ca9e3 Add first cut at ReazonSpeech recipe
This recipe is mostly based on egs/csj, but tweaked to the point that
it can be run with the ReazonSpeech corpus.

That being said, there are some big caveats:

 * Currently the model quality is not very good. Actually, it is very
   bad. I trained a model with a 1000h corpus, and it resulted in >80%
   CER on JSUT.

 * The core issue seems to be that Zipformer is prone to ignoring
   utterances as silent segments. It often produces an empty hypothesis
   even though the audio actually contains human voice.

 * This issue has already been reported upstream and is not fully
   resolved as of Dec 2023.

Signed-off-by: Fujimoto Seiji <fujimoto@ceptord.net>
2023-12-18 16:12:11 +09:00
Fujimoto Seiji
16c02cfcc2 Merge latest commit 'b0f70c9' on k2-fsa/icefall
I needed this in order to pull unreleased fixes. The last tagged version
was too old (dating back to Jul 2023), and not compatible with recent
lhotse releases.
2023-12-18 15:08:41 +09:00
zr_jin
10a234709c
bugs fixed (#1416) 2023-12-14 11:26:37 +08:00
Fangjun Kuang
f85f0252a9
Add greedy search for streaming zipformer CTC. (#1415) 2023-12-13 17:34:12 +08:00
zr_jin
d0da509055
Support ONNX export for Streaming CTC Encoder (#1413)
* Create export-onnx-streaming-ctc.py

* doc_str updated

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-12-13 10:33:28 +08:00
Fangjun Kuang
9e9fe7954d
Upload gigaspeech zipformer models in CI (#1412) 2023-12-12 18:57:04 +08:00
Fangjun Kuang
20a82c9abf
first commit (#1411) 2023-12-12 18:13:26 +08:00
Fangjun Kuang
b0f70c9d04
Fix torch.jit.script() export for pruned_transducer_stateless2 (#1410) 2023-12-10 11:38:39 +08:00
zr_jin
df56aff31e
minor fixes to the vits onnx exportation scripts (#1408) 2023-12-08 21:11:31 +08:00
Fangjun Kuang
e9ec827de7
Rename zipformer2 to zipformer_for_ncnn_export_only to avoid confusion. (#1407) 2023-12-08 14:29:24 +08:00
zr_jin
bda72f86ff
minor adjustments to the VITS recipes for onnx runtime (#1405) 2023-12-08 06:32:40 +08:00