904 Commits

Author SHA1 Message Date
marcoyang1998
80843bc972 minor fix 2023-07-27 12:15:34 +08:00
marcoyang1998
a5cc599ea7 resolve conflict 2023-07-27 11:52:47 +08:00
marcoyang1998
3cecabdc84 minor updates 2023-07-27 11:49:34 +08:00
marcoyang1998
f05778b1ab add some descriptions 2023-07-27 11:40:32 +08:00
kobenaxie
80d922c158
Update preprocess_commonvoice.py to fix text normalization bug. (#1181) 2023-07-26 16:54:42 +08:00
Fangjun Kuang
1dbbd7759e
Add tests for subsample.py and fix typos (#1180) 2023-07-25 14:46:18 +08:00
zr_jin
4ab7d61008
removed batch_name to fix a KeyError with "uttid" (#1172) 2023-07-15 12:39:32 +08:00
marcoyang1998
5ed6fc0e6d
add sym link (#1170) 2023-07-12 15:37:14 +08:00
Desh Raj
41b16d7838
SURT recipe for AMI and ICSI (#1133)
* merge upstream

* add SURT model and training

* add libricss decoding

* add chunk width randomization

* decode SURT with libricss

* initial commit for zipformer_ctc

* remove unwanted changes

* remove changes to other recipe

* fix zipformer softlink

* fix for JIT export

* add missing file

* fix symbolic links

* update results

* clean commit for SURT recipe

* training libricss surt model

* remove unwanted files

* remove unwanted changes

* remove changes in librispeech

* change some files to symlinks

* remove unwanted changes in utils

* add export script

* add README

* minor fix in README

* add assets for README

* replace some files with symlinks

* remove unused decoding methods

* initial commit for SURT AMI recipe

* fix symlink

* add train + decode scripts

* add missing symlink

* change files to symlink

* change file type
2023-07-08 23:01:51 +08:00
Yifan Yang
ffe816e2a8
Fix blank skip ci test (#1167)
* Fix for ci

* Fix frame_reducer
2023-07-06 23:12:41 +08:00
marcoyang1998
11523c5b89
Shallow fusion & LODR documentation (#1142)
* add shallow fusion documentation

* add documentation for LODR

* upload docs for LM rescoring
2023-07-06 19:11:01 +08:00
marcoyang
66318c8d1a upload docs for LM rescoring 2023-07-05 14:46:42 +08:00
marcoyang
8b6e8d0bd4 add item 2023-07-05 14:46:18 +08:00
Fangjun Kuang
6fd674312c
Fix failed CI tests (#1166) [v1.1] 2023-07-05 10:52:34 +08:00
Fangjun Kuang
130ad0319d
Fix CI test for zipformer CTC (#1165) 2023-07-05 10:38:29 +08:00
Fangjun Kuang
b8a17944e4
Fix zipformer CI test (#1164) 2023-07-05 10:23:35 +08:00
Desh Raj
a4402b88e6
SURT multi-talker ASR recipe (#1126)
* merge upstream

* add SURT model and training

* add libricss decoding

* add chunk width randomization

* decode SURT with libricss

* initial commit for zipformer_ctc

* remove unwanted changes

* remove changes to other recipe

* fix zipformer softlink

* fix for JIT export

* add missing file

* fix symbolic links

* update results

* clean commit for SURT recipe

* training libricss surt model

* remove unwanted files

* remove unwanted changes

* remove changes in librispeech

* change some files to symlinks

* remove unwanted changes in utils

* add export script

* add README

* minor fix in README

* add assets for README

* replace some files with symlinks

* remove unused decoding methods

* fix symlink

* address comments from @csukuangfj
2023-07-04 19:25:58 +08:00
zr_jin
856c0f2a60
fixed default param for an aishell recipe (#1159) 2023-07-04 19:12:39 +08:00
Nickolay V. Shmyrev
eca0202632
Add start-batch option for RNNLM training (#1161)
* Add start-batch option for RNNLM training

* Also set epoch

* Skip batches on load
2023-07-04 10:13:25 +08:00
Fangjun Kuang
9009d028a0
Fix ONNX export for the latest non-streaming zipformer. (#1160) 2023-07-03 23:56:51 +08:00
Fangjun Kuang
c3e23ec8d2
Fix logaddexp for ONNX export (#1158) 2023-07-02 10:30:09 +08:00
MicKot
98d89463f6
zipformer2 logaddexp onnx safe (#1157) 2023-06-30 21:16:40 +08:00
Zengwei Yao
ccd8c624dd
support testing onnx exported model on the test sets (#1150)
* support testing onnx exported model on the test sets

* use token_table instead
2023-06-30 12:05:37 +08:00
Desh Raj
c59c89fc13
Minor fix in tedlium results file (#1153) 2023-06-29 13:09:01 +02:00
Wei Kang
db71b03026
Support int8 quantization in decoder (#1152) 2023-06-29 16:48:59 +08:00
marcoyang
d3c0c797a2 minor fix 2023-06-29 16:45:23 +08:00
marcoyang
a6462490fb minor fixes 2023-06-29 12:33:26 +08:00
marcoyang1998
2f1af8f303
Update docs/source/decoding-with-langugage-models/LODR.rst
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-06-29 12:10:28 +08:00
marcoyang1998
e429a152e9
Update docs/source/decoding-with-langugage-models/shallow-fusion.rst
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-06-29 12:10:20 +08:00
marcoyang1998
85a8a0a130
Update docs/source/decoding-with-langugage-models/shallow-fusion.rst
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-06-29 12:10:07 +08:00
marcoyang1998
8be9f0d562
Update docs/source/decoding-with-langugage-models/shallow-fusion.rst
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-06-29 12:10:00 +08:00
marcoyang1998
b55dd5e364
Update docs/source/decoding-with-langugage-models/LODR.rst
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-06-29 12:09:53 +08:00
marcoyang1998
c0709c8107
Update docs/source/decoding-with-langugage-models/LODR.rst
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-06-29 12:09:40 +08:00
marcoyang1998
78fec8ef6f
Update docs/source/decoding-with-langugage-models/LODR.rst
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-06-29 12:08:12 +08:00
marcoyang1998
5ff647e226
Update docs/source/decoding-with-langugage-models/LODR.rst
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-06-29 12:08:02 +08:00
marcoyang
ec942c25cf minor fixes 2023-06-29 10:16:15 +08:00
marcoyang
3b3ada765c minor fixes 2023-06-28 17:25:42 +08:00
marcoyang
34682d3b07 minor updates 2023-06-28 17:05:08 +08:00
marcoyang
3207ceab46 update documentation for shallow fusion 2023-06-28 16:53:09 +08:00
marcoyang
2ada280379 add documentation for LODR 2023-06-28 16:52:57 +08:00
Desh Raj
9c2172c1c4
Zipformer for TedLium (#1125)
* initial commit for zipformer tedlium

* fix unk decoding

* add pretrained model and logs

* update for new AsrModel

* add option for choosing rnnt type

* add results with modified rnnt
2023-06-28 16:43:49 +08:00
Fangjun Kuang
968ebd236b
Fix ONNX export of the latest streaming zipformer model. (#1148) 2023-06-27 14:35:59 +08:00
Wei Kang
219bba1310
zipformer wenetspeech (#1130)
* copy files

* update train.py

* small fixes

* Add decode.py

* Fix dataloader in decode.py

* add blank penalty

* Add blank-penalty to other decoding method

* Minor fixes

* add zipformer2 recipe

* Minor fixes

* Remove pruned7

* export and test models

* Replace bpe with tokens in export.py and pretrain.py

* Minor fixes

* Minor fixes

* Minor fixes

* Fix export

* Update results

* Fix zipformer-ctc

* Fix ci

* Fix ci

* Fix CI

* Fix CI

---------

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-06-26 09:33:18 +08:00
marcoyang
8abe24cc77 add LODR 2023-06-25 18:39:29 +08:00
frankyoujian
4d5b8369ae
fix small typo (#1144) 2023-06-21 17:17:19 +08:00
marcoyang
0fbdadfe7b change wording 2023-06-20 17:09:52 +08:00
marcoyang
542bbc936e minor fix 2023-06-20 17:02:32 +08:00
marcoyang
645e2a5ed8 add shallow fusion documentation 2023-06-20 17:02:21 +08:00
marcoyang
ad24b4ad9e resolve conflict 2023-06-19 12:31:35 +08:00
Yifan Yang
d667dc365b
Fix for diagnostic (#1135)
* CTC loss return tensor

* Update model.py
2023-06-16 15:04:41 +08:00