836 Commits

Author SHA1 Message Date
Peter Ross
af8907e1ec
Update pre-commit isort package to v5.11.5 (#1095) 2023-05-24 19:57:37 +08:00
Zengwei Yao
6826b076d4
add flops profiler, support for Zipformer encoder and Conformer encoder (#1093)
* add flops profiler, support for Zipformer encoder and Conformer encoder

* support for reworked conformer and old zipformer

* skip black check
2023-05-24 19:10:45 +08:00
Fangjun Kuang
1df71a6b38
add onnx export for stateless2 (#1086) 2023-05-23 16:11:00 +08:00
Fangjun Kuang
ea8b15309f
Add onnx export scripts for wenetspeech recipe. (#1085) 2023-05-23 13:32:14 +08:00
Fangjun Kuang
dbcf0b41db
Fix stateless7 training error (#1082) 2023-05-23 12:52:02 +08:00
marcoyang1998
585e7b224f
Aishell pruned_transducer_stateless7 (#962)
* Add pruned_transducer_stateless7 for Aishell

* update README.md

* update comments and small fixes
2023-05-23 11:04:33 +08:00
Yifan Yang
7c4ff66a3d
Fix yesno CI test (#1078) 2023-05-22 12:46:43 +08:00
Yifan Yang
90c392b7b3
Add docs for Fine-tune with mux (#1074)
* Update RESULTS.md
2023-05-22 12:39:51 +08:00
Fangjun Kuang
3883e362ad
Fix yesno CI test (#1077) 2023-05-22 12:29:51 +08:00
Zengwei Yao
8070258ec5
fix conv_emformer2, when using right_context_length=0 (#1076) 2023-05-21 20:31:54 +08:00
Zengwei Yao
30fcd16c7d
rm zipformer/__init__.py (#1075) 2023-05-20 23:12:11 +08:00
Zengwei Yao
a7e142b7ff
Support long audios recognition (#980)
* support long file transcription

* rename recipe as long_file_recog

* add docs

* support multi-gpu decoding

* style fix
2023-05-19 20:27:55 +08:00
Zengwei Yao
f18b539fbc
Add the upgraded Zipformer model (#1058)
* add the zipformer codes, copied from branch from_dan_scaled_adam_exp1119

* support model export with torch.jit.script

* update RESULTS.md

* support exporting streaming model with torch.jit.script

* add results of streaming models, with some minor changes

* update README.md

* add CI test

* update k2 version in requirements-ci.txt

* update pyproject.toml
2023-05-19 16:47:59 +08:00
Fangjun Kuang
a5bbfc6f7e
Update doc for exporting to ncnn (#1072) 2023-05-19 16:22:08 +08:00
Fangjun Kuang
ae1949ddcc
Support using the latest master from tencent/ncnn (#1070)
* Support using the latest master from tencent/ncnn

* small fixes
2023-05-18 20:56:58 +08:00
Yifan Yang
562bda91e4
Add adaption recipe for pruned_transducer_stateless7 (#1059)
* Add mux for finetune

* Add comments

* Fix for black

* Update finetune.py
2023-05-17 16:02:27 +08:00
Wei Kang
bccd20d978
Training with byte level BPE (TAL_CSASR) (#1033)
* Add byte level bpe tal_csasr recipe

* Minor fixes to decoding and exporting

* Fix prepare.sh

* Update results
2023-05-16 12:44:52 +08:00
tomato18463
7a9f40aac5
Update the yesno recipe logs in doc (#1060) 2023-05-15 11:16:53 +08:00
arbs-gpu
30bde4b788
fix rnn_lm/train.py usage (#1055) 2023-05-11 17:37:47 +08:00
PF Luo
44d016e4a7
export score_token interface for onnx-runtime (#1050) 2023-05-10 22:41:07 +08:00
Fangjun Kuang
6c326427a0
Support exporting streaming conformer to ONNX (#1047) 2023-05-10 14:47:37 +08:00
Fangjun Kuang
86b0db6eb9
update installation doc (#1049) 2023-05-09 16:13:21 +08:00
Fangjun Kuang
5b50ffda54
support using mini librispeech in training (#1048)
* support mini librispeech in training

* update onnx export doc
2023-05-09 15:10:06 +08:00
Fangjun Kuang
ebbab37776
Fix broken code in download_lm.py (#1046) 2023-05-08 20:48:17 +08:00
Peter Ross
62c9dd9703
make egs/timit work according to the documentation (#1044)
* prepare.sh: restore working directory after git lfs pull
* set execute permissions on python scripts called by prepare.sh
2023-05-08 19:07:40 +08:00
Yifan Yang
24b50a5bad
Update README.md (#1043)
* Update README.md
2023-05-08 16:59:05 +08:00
Fangjun Kuang
efbb577b88
fix compiling HLG (#1039) 2023-05-07 16:26:13 +08:00
Yifan Yang
98569b2607
Update RESULTS.md (#1036)
* Update RESULTS.md
2023-05-06 17:51:55 +08:00
Wei Kang
80156dda09
Training with byte level BPE (AIShell) (#986)
* copy files from zipformer librispeech

* Add byte bpe training for aishell

* compile LG graph

* Support LG decoding

* Minor fixes

* black

* Minor fixes

* export & fix pretrain.py

* fix black

* Update RESULTS.md

* Fix export.py
2023-05-04 19:16:17 +08:00
PF Luo
61ec3a7a8f
fix export RNNLM onnx model typo (#1029) 2023-04-28 19:53:06 +08:00
Yuanhang Zhang
b0228c536e
Fix typo in librispeech OpenFST-based HLG preparation script (#1028) 2023-04-28 19:52:32 +08:00
PF Luo
298ed4520f
add meta-data embedding_dim to RNNLM onnx-model (#1026) 2023-04-28 16:33:46 +08:00
Fangjun Kuang
2767b9ff11
Support exporting RNNLM to ONNX. (#1014)
* Support exporting RNNLM to ONNX.

* add int8 models

* fix style issues

* Fix EOS padding

* support exporting for streaming ASR
2023-04-27 14:36:36 +08:00
marcoyang1998
45c13e90e4
RNNLM rescore + Low-order density ratio (#1017)
* add rnnlm rescore + LODR

* add LODR in decode.py

* update RESULTS
2023-04-24 15:00:02 +08:00
Yifan Yang
2096e69bda
Use CutSet.mux for multidataset (#1020)
* Use CutSet.mux

* Remove mischange

* Fix for style check
2023-04-23 18:41:44 +08:00
Yifan Yang
d67a49afe4
Add multidataset (#1010)
* Add Common Voice for multidataset

* Add prepare_multidataset.sh

* Add dataset mixing


* Update prepare_multidataset.sh

* Update prepare_giga_speech.sh

* update comments

* Add split and shuffle mechanism

* Add multi-dataset train

* Fix for deleting

* Fix for modifying

* Add comments

* Change type for perturb_speed

* Fix for style check

* Small fix

* Add filter

* Remove warning
2023-04-21 18:09:41 +08:00
marcoyang1998
57d6482a79
Streaming Zipformer with multi-dataset (#984)
* modify train.py

* add right padding option in decode.py

* update RESULTS.md
2023-04-21 15:43:28 +08:00
Wei Kang
0efed1cec5
Fix path in aishell rnnlm training (#1016) 2023-04-20 23:09:31 +08:00
Wei Kang
5c65516e05
Fix aishell rnnlm training command (#1015) 2023-04-20 16:14:16 +08:00
Yifan Yang
81d386ef3e
Add compute_ppl.py and ngram_entropy_pruning.py (#1013) 2023-04-20 12:27:43 +08:00
Wen Ding
78b9dcc936
Support exporting BS Zipformer models to ONNX, used in Triton Server (#1008)
* Support export BS Zipformer models to ONNX in Triton

* Update copyright

* Update exporting codes for BS zipformer models

* Code format

* Update comments

* Update export_onnx.py

---------

Co-authored-by: Yifan Yang <64255737+yfyeung@users.noreply.github.com>
2023-04-18 17:05:08 +08:00
Yifan Yang
05e7435d0d
Move soft links into proper position (#1007) 2023-04-18 10:11:12 +08:00
Yifan Yang
8838fe0bd2
Zipformer for Common Voice (#997)
* Add soft links in pruned_transducer_stateless7 for CommonVoice

* Add python files

* Update prepare.sh

* Update normalization

* Fix for soft links

* Add some docs

* Add export

* Update egs/commonvoice/ASR/RESULTS.md

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

* Add export for onnx

---------

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-04-17 17:47:25 +08:00
marcoyang1998
34d1b07c3d
Modified beam search with RNNLM rescoring (#1002)
* add RNNLM rescore

* add shallow fusion and lm rescore for streaming zipformer

* minor fix

* update RESULTS.md

* fix yesno workflow, change from ubuntu-18.04 to ubuntu-latest
2023-04-17 16:43:00 +08:00
Fangjun Kuang
e32658e620
Fix torch.jit.script() export for streaming zipformer. (#1005) 2023-04-17 16:13:30 +08:00
Zengwei Yao
7c7d9ab042
add @torch.jit.export for streaming_forward func in Zipformer class (#1004) 2023-04-17 12:03:52 +08:00
Zengwei Yao
5f066d3d53
support decoding and computing RTF on test sets with onnx models (#995)
* support decode and compute RTF on test sets with onnx models

* support onnx export and decode in pruned_transducer_stateless
2023-04-12 19:04:50 +08:00
Yifan Yang
dbf2aa3212
Create preprocess_commonvoice.py (#996) 2023-04-11 21:04:54 +08:00
Yifan Yang
3cb0a0121b
Add Common Voice (#994)
* Add commonvoice

* Add data preparation recipe

* Update

* update prepare.sh

* Fix for black

* Update prefix with cv-

* 20 ->

* Update compute_fbank_commonvoice_dev_test.py

* Update prepare.sh

* Update compute_fbank_commonvoice_dev_test.py
2023-04-11 20:56:40 +08:00
Yifan Yang
33578cca48
Fix filter_cuts in compute_fbank_librispeech.py (#993) 2023-04-11 11:12:05 +08:00