978 Commits

Author SHA1 Message Date
Zengwei Yao
1e6d6f8160
shuffle full Librispeech for zipformer recipes (#869)
* shuffle libri
2023-02-03 11:54:57 +08:00
Yifan Yang
e36ea89112
update result.md for pruned_transducer_stateless7_ctc_bs (#865) 2023-02-01 21:04:56 +08:00
Yifan Yang
d8234e199c
Add export to ONNX for Zipformer+CTC using blank skip (#861)
* Add export to ONNX for Zipformer+CTC using blank skip

---------

Co-authored-by: yifanyang <yifanyeung@yifanyangs-MacBook-Pro.local>
2023-01-31 15:57:03 +08:00
BuaaAlban
e9019511eb
Fix bug in streaming_conformer_ctc egs (#862)
* Update train.py

Fix transducer lstm egs bug as mentioned in issue 579

* Update train.py

fix dataloader bug
2023-01-31 15:19:50 +08:00
Yifan Yang
e277e31e37
update huggingface link of zipformer_ctc_blankskip.rst (#858)
* update huggingface link

* update link

---------

Co-authored-by: yifanyang <yifanyeung@yifanyangs-MacBook-Pro.local>
2023-01-29 15:35:36 +08:00
Meng Wei
74a2069f94
fix expired links (#856) 2023-01-28 14:43:47 +08:00
Teo Wen Shen
1ce2bc1ee0
edit comments (#852) 2023-01-28 13:47:21 +08:00
Zengwei Yao
6b1ab71dc9
hardcode --filter-uneven-sized-batch (#854) 2023-01-27 21:24:12 +08:00
Wei Kang
f5ff7a18eb
Fix the unclear description for streaming model (#849) 2023-01-17 11:28:59 +08:00
Fangjun Kuang
0af3e7beda
fix export for stateless4 (#844) 2023-01-16 20:26:36 +08:00
Zengwei Yao
2a463a420d
Filter uneven-sized batch (#843)
* add filter_uneven_sized_batch function

* set --filter-uneven-sized-batch=True as default
2023-01-16 20:15:35 +08:00
Fangjun Kuang
5c8e9628cc
update faq for libpython3.10.so not found (#838) 2023-01-13 15:21:29 +08:00
Fangjun Kuang
958dbb3a1d
add doc for int8 quantization with sherpa-ncnn (#832)
* add doc for int8 quantization with sherpa-ncnn

* typo fixes
2023-01-11 20:29:36 +08:00
marcoyang1998
142420b3af
Add docs for distillation (#812)
* add README to docs

* update documents for distillation

* upload png files
2023-01-11 16:45:24 +08:00
Fangjun Kuang
8582b6e41a
Add doc about converting conv-emformer to sherpa-ncnn (#830) 2023-01-11 15:34:30 +08:00
Fangjun Kuang
c05f5d76df
fix decoding for ncnn (#828) 2023-01-10 20:52:13 +08:00
Fangjun Kuang
fcffa593f0
Add FAQs to doc (#827)
* Add FAQs

* small fixes
2023-01-10 15:38:33 +08:00
marcoyang1998
42cc10117e
Fix ncnn install (#824)
* add README to docs

* fix ncnn installation
2023-01-09 15:08:39 +08:00
Fangjun Kuang
9453eb1c70
Fix doc for building ncnn (#822) 2023-01-06 17:00:27 +08:00
kobenaxie
9a9c5a0f9b
remove unused codes. (#821) 2023-01-06 11:16:22 +08:00
Yifan Yang
b9626f2e06
fix typo for ctc-decode.py (#815)
Co-authored-by: yifanyang <yifanyeung@yifanyangs-MacBook-Pro.local>
2023-01-05 17:18:43 +08:00
Fangjun Kuang
8642dbc0bd
Fix setup_dist (#806) 2023-01-04 12:21:19 +08:00
Yunusemre
0f26edfde9
Add Zipformer Onnx Support (#778)
* add export script

* add zipformer onnx pretrained script

* add onnx zipformer test

* fix style

* add zipformer onnx to workflow

* replace is_in_onnx_export with is_tracing

* add github.event.label.name == 'onnx'

* add is_tracing to necessary conditions

* fix pooling_mask

* add onnx_check

* add onnx_check to scripts

* add is_tracing to scaling.py
2023-01-03 16:59:44 +08:00
marcoyang1998
80cce141b4
Full libri fix manifest (#804)
* modify the name of the directory of vq manifest

* fix missing manifest in full libri training
2023-01-03 15:40:53 +08:00
Daniil
2fd970b682
not removing result_dir in tedlium conformer ctc2 + add lm stem to compile_hlg_using_openfst.py + add MASTER_ADDR to be provided to setup_dist (#801) 2023-01-02 08:08:32 +08:00
Zengwei Yao
67ae5fdf2b
Doc streaming zipformer (#798)
* add doc for streaming_zipformer

* update README.md
2022-12-30 15:21:18 +08:00
behnamasefi
a54b748a02
check for utterance len (#795)
Co-authored-by: behnam <basefisaray@roku.com>
2022-12-30 11:06:09 +08:00
Zengwei Yao
d167aad4ab
Add streaming zipformer (#787)
* add streaming zipformer code

* add test_model.py

* add export.py, pretrained.py, jit_pretrained.py

* add cached_len for pooling module

* add jit_trace_export.py and jit_trace_pretrained.py

* fix bug in jit.trace

* update RESULTS.md

* add CI test

* minor fix in pruned_transducer_stateless7/zipformer.py

* update README.md
2022-12-30 10:52:18 +08:00
marcoyang1998
aa0fe4e4ac
Fix typos in RESULTS.md (#797) 2022-12-29 11:54:42 +08:00
marcoyang1998
1f0408b103
Support Transformer LM (#750)
* support transformer LM

* show number of parameters during training

* update docstring

* testing files for ppl calculation

* add lm wrapper for rnn and transformer LM

* apply lm wrapper in lm shallow fusion

* small updates

* update decode.py to support LM fusion and LODR

* add export.py

* update CI and workflow

* update decoding results

* fix CI

* remove transformer LM from CI test
2022-12-29 10:53:36 +08:00
Yuekai Zhang
3c54333b06
fix bug (#796) 2022-12-28 11:20:38 +08:00
marcoyang1998
05dfd5e630
Fix distillation with HuBERT (#790)
* update vq huggingface url

* remove hard lhotse version requirement

* resolve ID mismatch

* small fixes


* Update egs/librispeech/ASR/pruned_transducer_stateless6/vq_utils.py

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

* update version check

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2022-12-27 15:26:11 +08:00
Yifan Yang
a24a1cbfa9
small fix for zipformer_ctc_blankskip.rst (#792)
Co-authored-by: yifanyang <yifanyeung@yifanyangs-MacBook-Pro.local>
2022-12-27 15:06:53 +08:00
Fangjun Kuang
88b7895adf
fix librispeech.py in multi-dataset setup (#791) 2022-12-27 13:59:55 +08:00
Fangjun Kuang
dfbcf606e7
small fixes to prepare.sh (#789) 2022-12-27 09:25:42 +08:00
Yifan Yang
4e249da2c4
Add zipformer_ctc_blankskip.rst (#784)
* Add zipformer_ctc_blankskip.rst

* typo fix for zipformer_mmi.rst

* fix warning

* Update docs/source/recipes/Non-streaming-ASR/librispeech/zipformer_ctc_blankskip.rst

Co-authored-by: yifanyang <yifanyeung@yifanyangs-MacBook-Pro.local>
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2022-12-26 14:30:20 +08:00
Yifan Yang
59eb465b3c
optimize frame_reducer.py (#783)
Co-authored-by: yifanyang <yifanyeung@yifanyangs-MacBook-Pro.local>
2022-12-23 17:55:36 +08:00
BuaaAlban
7eb2d0edb6
Update train.py (#773)
Fix transducer lstm egs bug as mentioned in issue 579
2022-12-23 11:38:22 +08:00
Yifan Yang
070c77e724
Add Blankskip to Zipformer+CTC (#730)
* init files

* add ctc as auxiliary loss and ctc_decode.py

* tuning the scalar of HLG score for 1best, nbest and nbest-oracle

* rename to pruned_transducer_stateless7_ctc

* fix doc

* fix bug, recover the hlg scores

* modify ctc_decode.py, move out the hlg scale

* fix hlg_scale

* add export.py and pretrained.py, and so on

* upload files, update README.md and RESULTS.md

* add CI test

* update .gitignore

* create symlinks

* Add Blank Skip to Zipformer+CTC

* Add warmup to blank skip

* Add warmup to blank skip

* Add __init__.py

* Add parameters_names to Adam

* Add warmup to blank skip

* Modify frame_reducer

* Modify frame_reducer

* Add Blank Skip to decode.

* Add ctc_decode.py

* Add blank skip to Zipformer+CTC

* process conflict

* process conflict

* modify ctc_guild_decode_bk.py

* modify Lconv

* produce the conflict

* Add export.py

* finish export

* fix for running black

* Add ci test

* Add ci-test

* chmod

* chmod

* fix bug for ci-test

* fix bug for ci-test

* fix bug for ci-test

* rename the dirname

* rename the dirname

* change dirname

* change dirname

* fix notes

* add pretrained.py

* add pretrained.py

* add pretrained.py

* add pretrained.py

* add pretrained.py

* add pretrained.py

* fix

* fix

* fix

* finished

* add the Copyright info and notes

Co-authored-by: Zengwei Yao <yaozengwei@outlook.com>
Co-authored-by: yifanyang <yifanyeung@yifanyangs-MacBook-Pro.local>
2022-12-21 17:41:31 +08:00
Zengwei Yao
65d7192dca
Fix zipformer attn_output_weights (#774)
* fix attn_output_weights

* remove in-place op
2022-12-19 20:10:39 +08:00
Zengwei Yao
fbc1d3b194
fix src_key_padding_mask in DownsampledZipformerEncoder (#768) 2022-12-17 22:03:13 +08:00
kobenaxie
6d659f423d
delete duplicate line for encoder initial state (#765) 2022-12-15 20:42:07 +08:00
Wei Kang
ad475ec10d
Add documents for pruned_transducer_stateless (#526)
* begin to add documents for pruned_transducer_stateless

* Move lstm docs to Streaming folder

* Add documents for pruned transducer stateless models

* Move zipformer mmi to non-streaming recipe

* Add more docs for streaming decoding

* Fix typo
2022-12-15 19:07:28 +08:00
Fangjun Kuang
fbc8894804
Add comment for compile_hlg_using_openfst.py (#762) 2022-12-14 13:47:23 +08:00
Daniil
b293db4baf
Tedlium3 conformer ctc2 (#696)
* modify preparation

* small refactor

* add tedlium3 conformer_ctc2

* modify decode

* filter unk in decode

* add scaling converter

* address comments

* fix lambda function lhotse

* add implicit manifest shuffle

* refactor ctc_greedy_search

* import model arguments from train.py

* style fix

* fix ci test and last style issues

* update RESULTS

* fix RESULTS numbers

* fix label smoothing loss

* update model parameters number in RESULTS
2022-12-13 16:13:26 +08:00
Zengwei Yao
0470bbae66
minor fix for zipformer recipe (#758)
* minor fix

* add CI test
2022-12-13 15:47:30 +08:00
Zengwei Yao
b25c234c51
Add Zipformer-MMI (#746)
* Minor fix to conformer-mmi

* Minor fixes

* Fix decode.py

* add training files

* train with ctc warmup

* add pruned_transducer_stateless7_mmi

* add zipformer_mmi/mmi_decode.py, using HP as decoding graph

* add mmi_decode.py

* remove pruned_transducer_stateless7_mmi

* rename zipformer_mmi/train_with_ctc.py as zipformer_mmi/train.py

* remove unused method

* rename mmi_decode.py

* add export.py pretrained.py jit_pretrained.py ...

* add RESULTS.md

* add CI test

* add docs

* add README.md

Co-authored-by: pkufool <wkang.pku@gmail.com>
2022-12-11 21:30:39 +08:00
wzy
e83409cbe5
Filter the training data of T < S for Wenet train recipe (#753)
* filter the case of T < S for training data

* fix style issues

* fix style issues

* fix style issues

Co-authored-by: 张云斌 <zhangyunbin@MacBook-Air.local>
2022-12-11 20:16:10 +08:00
Yifan Yang
02c18ba4b2
rm the dup line of Zipformer.py (#755)
Co-authored-by: yifanyang <yifanyeung@yifanyangs-MacBook-Pro.local>
2022-12-10 19:34:19 +08:00
Desh Raj
c4aaf3ea3b
Add AliMeeting multi-condition training recipe (#751)
* add AliMeeting multi-domain recipe

* convert scripts to symbolic links
2022-12-10 18:15:23 +08:00