911 Commits

Author SHA1 Message Date
Desh Raj
3ef33f2126 merge upstream 2023-07-04 07:37:42 -04:00
Desh Raj
a4402b88e6
SURT multi-talker ASR recipe (#1126)
* merge upstream

* add SURT model and training

* add libricss decoding

* add chunk width randomization

* decode SURT with libricss

* initial commit for zipformer_ctc

* remove unwanted changes

* remove changes to other recipe

* fix zipformer softlink

* fix for JIT export

* add missing file

* fix symbolic links

* update results

* clean commit for SURT recipe

* training libricss surt model

* remove unwanted files

* remove unwanted changes

* remove changes in librispeech

* change some files to symlinks

* remove unwanted changes in utils

* add export script

* add README

* minor fix in README

* add assets for README

* replace some files with symlinks

* remove unused decoding methods

* fix symlink

* address comments from @csukuangfj
2023-07-04 19:25:58 +08:00
zr_jin
856c0f2a60
fixed default param for an aishell recipe (#1159) 2023-07-04 19:12:39 +08:00
Nickolay V. Shmyrev
eca0202632
Add start-batch option for RNNLM training (#1161)
* Add start-batch option for RNNLM training

* Also set epoch

* Skip batches on load
2023-07-04 10:13:25 +08:00
Fangjun Kuang
9009d028a0
Fix ONNX export for the latest non-streaming zipformer. (#1160) 2023-07-03 23:56:51 +08:00
Fangjun Kuang
c3e23ec8d2
Fix logaddexp for ONNX export (#1158) 2023-07-02 10:30:09 +08:00
MicKot
98d89463f6
zipformer2 logaddexp onnx safe (#1157) 2023-06-30 21:16:40 +08:00
Zengwei Yao
ccd8c624dd
support testing onnx exported model on the test sets (#1150)
* support testing onnx exported model on the test sets

* use token_table instead
2023-06-30 12:05:37 +08:00
Desh Raj
c59c89fc13
Minor fix in tedlium results file (#1153) 2023-06-29 13:09:01 +02:00
Wei Kang
db71b03026
Support int8 quantization in decoder (#1152) 2023-06-29 16:48:59 +08:00
Desh Raj
9c2172c1c4
Zipformer for TedLium (#1125)
* initial commit for zipformer tedlium

* fix unk decoding

* add pretrained model and logs

* update for new AsrModel

* add option for choosing rnnt type

* add results with modified rnnt
2023-06-28 16:43:49 +08:00
Desh Raj
30cc677b2b add missing symlink 2023-06-27 10:53:19 -04:00
Fangjun Kuang
968ebd236b
Fix ONNX export of the latest streaming zipformer model. (#1148) 2023-06-27 14:35:59 +08:00
Wei Kang
219bba1310
zipformer wenetspeech (#1130)
* copy files

* update train.py

* small fixes

* Add decode.py

* Fix dataloader in decode.py

* add blank penalty

* Add blank-penalty to other decoding method

* Minor fixes

* add zipformer2 recipe

* Minor fixes

* Remove pruned7

* export and test models

* Replace bpe with tokens in export.py and pretrain.py

* Minor fixes

* Minor fixes

* Minor fixes

* Fix export

* Update results

* Fix zipformer-ctc

* Fix ci

* Fix ci

* Fix CI

* Fix CI

---------

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-06-26 09:33:18 +08:00
frankyoujian
4d5b8369ae
fix small typo (#1144) 2023-06-21 17:17:19 +08:00
Desh Raj
9a288b85cc Merge branch 'surt' into surt_ami 2023-06-20 10:58:09 -04:00
Desh Raj
75e5c2775d Merge branch 'master' of https://github.com/k2-fsa/icefall into surt 2023-06-20 10:57:48 -04:00
Desh Raj
d80ed9377f add train + decode scripts 2023-06-18 05:27:11 -04:00
Desh Raj
8d70a2aeca Merge branch 'surt' into surt_ami 2023-06-18 05:12:55 -04:00
Desh Raj
1bed8d86ca fix symlink 2023-06-18 05:12:03 -04:00
Yifan Yang
d667dc365b
Fix for diagnostic (#1135)
* CTC loss return tensor

* Update model.py
2023-06-16 15:04:41 +08:00
Desh Raj
14818f5dd8 initial commit for SURT AMI recipe 2023-06-15 14:34:43 -04:00
Yifan Yang
0a465794a8
Fix Zipformer (#1132)
* Update model.py

* Update train.py

* Update decoder.py
2023-06-15 17:52:14 +08:00
Fangjun Kuang
947f0614c9
Fix running exported model on GPU. (#1131) 2023-06-15 12:25:15 +08:00
Desh Raj
d6b88aaa98 remove unused decoding methods 2023-06-14 04:12:52 -04:00
Desh Raj
92f6128127 replace some files with symlinks 2023-06-14 03:58:07 -04:00
Zengwei Yao
0ad037d076
Add CTC loss option in zipformer recipe (#1111)
* add CTC loss option in zipformer recipe

* add ctc_decode.py

* support CTC model export, add jit_pretrained_ctc.py, pretrained_ctc.py

* update README.md and RESULTS.md

* add CI test
2023-06-14 14:27:29 +08:00
Desh Raj
058385a2ea add assets for README 2023-06-13 11:04:43 -04:00
Desh Raj
dd9b442fce minor fix in README 2023-06-13 10:56:17 -04:00
Desh Raj
738370f231 add README 2023-06-13 10:52:59 -04:00
Desh Raj
08a0f8707a add export script 2023-06-13 08:53:34 -04:00
Desh Raj
d6adf25c06 remove unwanted changes in utils 2023-06-13 08:42:38 -04:00
Desh Raj
2d3063becd change some files to symlinks 2023-06-13 08:24:20 -04:00
Desh Raj
93a5c878f1 remove changes in librispeech 2023-06-13 08:14:11 -04:00
Desh Raj
494e88bcb7 Merge branch 'master' of https://github.com/k2-fsa/icefall into surt 2023-06-13 08:06:05 -04:00
Desh Raj
8623a1bcb2 remove unwanted changes 2023-06-13 08:02:40 -04:00
Desh Raj
0cad336277 remove unwanted files 2023-06-13 07:59:05 -04:00
Desh Raj
d50cef82cc training libricss surt model 2023-06-12 16:43:32 -04:00
danfu
0cb71ad3bc
add updated zipformer onnx export (#1108)
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-06-12 14:02:23 +08:00
Peter Ross
b4c38d7547
Use symlinks for best epochs (#1123)
* utils: add symlink_or_copyfile

* pruned_transducer_stateless7: use symlinks (when possible) to output best epochs

* Rename function

---------

Co-authored-by: Yifan Yang <64255737+yfyeung@users.noreply.github.com>
2023-06-12 13:51:46 +08:00
Desh Raj
9ed22396a9 merge upstream 2023-06-11 16:43:17 -04:00
Desh Raj
42daafee4e clean commit for SURT recipe 2023-06-11 16:32:29 -04:00
Yifan Yang
dca21c2a17
Fix parameters_names in train.py (#1121) 2023-06-08 16:54:05 +08:00
SarahSmitho
3ae47a4940
verify have installed ffmpeg (#1117) 2023-06-07 11:17:38 +08:00
Fangjun Kuang
c0de78d3c0
Add data preparation for the MuST-C speech translation corpus (#1107) 2023-06-05 15:49:41 +08:00
Wei Kang
ba257efbcd
Add Context biasing (#1038)
* Add context biasing for librispeech

* Add context biasing for wenetspeech

* fix bugs

* Implement Aho-Corasick context graph

* fix some bugs

* Fixes to forward_one_step; add draw to context graph

* add output arc; fix black

* Fix wenetspeech tokenizer

* Minor fixes to the decode.py
2023-06-03 21:28:49 +08:00
Yifan Yang
ca60ced213
Fix typo (#1114)
* Fix typo for zipformer

* Fix typo for pruned_transducer_stateless7

* Fix typo for pruned_transducer_stateless7_ctc

* Fix typo for pruned_transducer_stateless7_ctc_bs

* Fix typo for pruned_transducer_stateless7_streaming

* Fix typo for pruned_transducer_stateless7_streaming_multi

* Fix file permissions for pruned_transducer_stateless7_streaming_multi

* Fix typo for pruned_transducer_stateless8

* Fix typo for pruned_transducer_stateless6

* Fix typo for pruned_transducer_stateless5

* Fix typo for pruned_transducer_stateless4

* Fix typo for pruned_transducer_stateless3
2023-06-02 14:12:42 +08:00
Yifan Yang
82f34a2388
Remove multidataset from librispeech/pruned_transducer_stateless7 (#1105)
* Add People's Speech to multidataset

* update

* remove multi from librispeech
2023-06-01 18:45:20 +08:00
Zengwei Yao
7a604057f9
update diagnostics, print limits in Balancer, merge changes from Dan's branch zlm59 (#1109) 2023-06-01 14:24:19 +08:00
Yifan Yang
03853f1ee5
Add peoples_speech (#1101)
* update

* Small fix

* Update egs/peoples_speech/ASR/prepare.sh

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

* limit normalize log

* Update egs/peoples_speech/ASR/local/compute_fbank_peoples_speech_valid_test.py

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

* Update compute_fbank_peoples_speech_splits.py

* Update compute_fbank_peoples_speech_valid_test.py

---------

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-05-31 12:46:17 +08:00