17 Commits

Author SHA1 Message Date
zr_jin
a81396b482
Use tokens.txt to replace bpe.model (#1162) 2023-08-12 16:53:59 +08:00
Fangjun Kuang
f5de2e90c6
Fix style issues. (#937) 2023-03-08 22:56:04 +08:00
pehonnet
07243d136a
remove key from result filename (#936)
Co-authored-by: pe-honnet <pe.honnet@telepathy.ai>
2023-03-08 21:06:07 +08:00
Fangjun Kuang
2b995639b7
Add ONNX support for Zipformer and ConvEmformer (#884) 2023-02-09 00:02:38 +08:00
huangruizhe
6693d907d3
shuffle full Librispeech data (#574)
* shuffled full/partial librispeech data

* fixed the code style issue

* Shuffled full librispeech data off-line

* Fixed style, addressed comments, and removed redundant code

* Used the suggested version of black

* Propagated the changes to other folders for librispeech (except
conformer_mmi and streaming_conformer_ctc)
2022-11-27 11:26:09 +08:00
Desh Raj
d31db01037 manual correction of black formatting 2022-11-17 14:18:05 -05:00
Desh Raj
107df3b115 apply black on all files 2022-11-17 09:42:17 -05:00
Fangjun Kuang
60317120ca
Revert "Apply new Black style changes" 2022-11-17 20:19:32 +08:00
Desh Raj
d110b04ad3 apply new black formatting to all files 2022-11-16 13:06:43 -05:00
Fangjun Kuang
e334e570d8
Filter utterances with number_tokens > number_feature_frames. (#604) 2022-11-12 07:57:58 +08:00
Zengwei Yao
3600ce1b5f
Apply delay penalty on transducer (#654)
* add delay penalty

* fix CI

* fix CI
2022-11-04 16:10:09 +08:00
Fangjun Kuang
d69bb826ed
Support exporting LSTM with projection to ONNX (#621)
* Support exporting LSTM with projection to ONNX

* Add missing files

* small fixes
2022-10-18 11:25:31 +08:00
Fangjun Kuang
d1f16a04bd
fix type hints for decode.py (#623) 2022-10-18 06:56:12 +08:00
Fangjun Kuang
099cd3a215
support exporting to ncnn format via PNNX (#571) 2022-09-20 22:52:49 +08:00
Fangjun Kuang
97b3fc53aa
Add LSTM for the multi-dataset setup. (#558)
* Add LSTM for the multi-dataset setup.

* Add results

* fix style issues

* add missing file
2022-09-16 18:40:25 +08:00
Fangjun Kuang
0598291ff1
minor fixes to LSTM streaming model (#537) 2022-08-20 09:50:50 +08:00
Zengwei Yao
f2f5baf687
Use ScaledLSTM as streaming encoder (#479)
* add ScaledLSTM

* add RNNEncoderLayer and RNNEncoder classes in lstm.py

* add RNN and Conv2dSubsampling classes in lstm.py

* hardcode bidirectional=False

* link from pruned_transducer_stateless2

* link scaling.py pruned_transducer_stateless2

* copy from pruned_transducer_stateless2

* modify decode.py pretrained.py test_model.py train.py

* copy streaming decoding files from pruned_transducer_stateless2

* modify streaming decoding files

* simplified code in ScaledLSTM

* flatten weights after scaling

* pruned2 -> pruned4

* link __init__.py

* fix style

* remove add_model_arguments

* modify .flake8

* fix style

* fix scale value in scaling.py

* add random combiner for training deeper model

* add using proj_size

* add scaling converter for ScaledLSTM

* support jit trace

* add using averaged model in export.py

* modify test_model.py, test if the model can be successfully exported by jit.trace

* modify pretrained.py

* support streaming decoding

* fix model.py

* Add cut_id to recognition results

* Add cut_id to recognition results

* do not pad in Conv subsampling module; add tail padding during decoding.

* update RESULTS.md

* minor fix

* fix doc

* update README.md

* minor change, filter infinite loss

* remove the condition that raises an error

* modify type hint for the return value in model.py

* minor change

* modify RESULTS.md

Co-authored-by: pkufool <wkang.pku@gmail.com>
2022-08-19 14:38:45 +08:00