680 Commits

Author SHA1 Message Date
Fangjun Kuang
cedf9aa24f
Fix shallow fusion and add CI tests for it (#676)
* Fix shallow fusion and add CI tests for it

* Fix -1 index in embedding introduced in the zipformer PR
2022-11-13 11:51:00 +08:00
Fangjun Kuang
7e82f87126
Add Zipformer from Dan (#672) 2022-11-12 18:11:19 +08:00
Fangjun Kuang
e334e570d8
Filter utterances with number_tokens > number_feature_frames. (#604) 2022-11-12 07:57:58 +08:00
Yuekai Zhang
2f43e4508b
fix mask errors when padding audios (#670) 2022-11-10 22:28:04 +08:00
Zengwei Yao
32de2766d5
Refactor getting timestamps in fsa-based decoding (#660)
* refactor getting timestamps for fsa-based decoding

* fix doc

* fix bug
2022-11-05 22:36:06 +08:00
Zengwei Yao
3600ce1b5f
Apply delay penalty on transducer (#654)
* add delay penalty

* fix CI

* fix CI
2022-11-04 16:10:09 +08:00
marcoyang1998
65b85b732c
Merge pull request #659 from marcoyang1998/master
Remove testing file
2022-11-04 12:29:55 +08:00
marcoyang1998
35b884bae6
Merge branch 'k2-fsa:master' into master 2022-11-04 12:28:49 +08:00
marcoyang
2271c3d396 remove testing file 2022-11-04 12:26:38 +08:00
marcoyang1998
7c50a019b1
Merge pull request #645 from marcoyang1998/master
Support RNNLM shallow fusion in modified beam search
2022-11-04 11:39:12 +08:00
marcoyang
a2d7095c1c resolve conflicts 2022-11-04 11:37:42 +08:00
marcoyang
b3c61b85e3 minor fixes 2022-11-04 11:32:09 +08:00
marcoyang
bdaeaae1ae resolve conflicts 2022-11-04 11:25:10 +08:00
marcoyang
0df597291f resolve conflict with timestamp feature 2022-11-04 11:17:56 +08:00
Wei Kang
64aed2cdeb
Fix LG log file name (#657) 2022-11-03 23:12:35 +08:00
Wei Kang
163d929601
Add fast_beam_search_LG (#622)
* Add fast_beam_search_LG

* add fast_beam_search_LG to commonly used recipes

* fix ci

* fix ci

* Fix error
2022-11-03 16:29:30 +08:00
marcoyang
f45d9c4383 resolve conflicts 2022-11-03 11:12:49 +08:00
marcoyang
2a52b8c125 update docs 2022-11-03 11:10:21 +08:00
Teo Wen Shen
d2a1c65c5c
fix torchaudio version in dockerfile (#653)
* fix torchaudio version in dockerfile

* remove kaldiio
2022-11-03 10:27:18 +08:00
zr_jin
5d285625cf
Update tdnn_lstm_ctc.rst (#648) 2022-11-02 23:37:01 +08:00
zr_jin
04671b44f8
Update README.md (#649) 2022-11-02 23:36:40 +08:00
zr_jin
8f79f6de00
Update tdnn_lstm_ctc.rst (#647) 2022-11-02 23:36:07 +08:00
marcoyang1998
e3f218b62b
Update egs/librispeech/ASR/lstm_transducer_stateless2/decode.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2022-11-02 22:10:23 +08:00
marcoyang
b62fd917ae remove redundant test lines 2022-11-02 18:17:05 +08:00
marcoyang
fb45b95c90 minor fixes 2022-11-02 18:11:39 +08:00
marcoyang
9a01b9098d include previous added decoding method 2022-11-02 18:03:56 +08:00
marcoyang
6c8d1f9ef5 update 2022-11-02 17:48:58 +08:00
marcoyang
babcfd4b68 update author info 2022-11-02 17:27:31 +08:00
marcoyang
0a46a39e24 update decoding commands 2022-11-02 17:25:31 +08:00
marcoyang
86662f0b97 update results 2022-11-02 17:24:53 +08:00
marcoyang
63d0a52dbd support RNNLM shallow fusion in stateless5 2022-11-02 16:37:29 +08:00
marcoyang
de2f5e3e6d support RNNLM shallow fusion for LSTM transducer 2022-11-02 16:15:56 +08:00
Wei Kang
d389524d45
remove tail padding for non-streaming models (#625) 2022-11-01 11:09:56 +08:00
Zengwei Yao
03668771d7
Get timestamps during decoding (#598)
* print out timestamps during decoding

* add word-level alignments

* support to compute mean symbol delay with word-level alignments

* print variance of symbol delay

* update doc

* support to compute delay for pruned_transducer_stateless4

* fix bug

* add doc
2022-11-01 10:24:00 +08:00
Fangjun Kuang
ff3f026381
Checkout the LM for aishell explicitly (#642) 2022-10-31 19:47:43 +08:00
Fangjun Kuang
7f1c0e07b6
Remove onnx and onnxruntime from requirements.txt (#640)
* Remove onnx and onnxruntime from requirements.txt
2022-10-31 13:44:40 +08:00
Teo Wen Shen
1abf2863bb
fix typos (#639) 2022-10-30 22:47:21 +08:00
Wei Kang
581d0361cc
Fix type hints for decode.py (#638)
* Fix type hints for decode.py

* Fix flake8
2022-10-30 16:35:30 +08:00
Nagendra Goel
6709bf1e63
Update train.py (#635)
Add the missing step to add the arguments to the parser.
2022-10-28 10:23:32 +08:00
Fangjun Kuang
499ac24ecb
Install kaldifst for GitHub actions (#632)
* Install kaldifst for GitHub actions
2022-10-24 15:07:29 +08:00
Fangjun Kuang
348494888d
Add kaldifst to requirements.txt (#631) 2022-10-22 13:14:44 +08:00
ezerhouni
9b671e1c21
Add Shallow fusion in modified_beam_search (#630)
* Add utility for shallow fusion

* test batch size == 1 without shallow fusion

* Use shallow fusion for modified-beam-search

* Modified beam search with ngram rescoring

* Fix code according to review

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2022-10-21 16:44:56 +08:00
marcoyang1998
c30b8d3a1c
fix number of parameters in RESULTS.md (#627) 2022-10-19 16:53:29 +08:00
Teo Wen Shen
15c1a4a441
CSJ Data Preparation (#617)
* workspace setup

* csj prepare done

* Change compute_fbank_musan.py to soft link

* add description

* change lhotse prepare csj command

* split train-dev here

* Add header

* remove debug

* save manifest_statistics

* generate transcript in Lhotse

* update comments in config file
2022-10-18 15:56:43 +08:00
Fangjun Kuang
d69bb826ed
Support exporting LSTM with projection to ONNX (#621)
* Support exporting LSTM with projection to ONNX

* Add missing files

* small fixes
2022-10-18 11:25:31 +08:00
Fangjun Kuang
d1f16a04bd
fix type hints for decode.py (#623) 2022-10-18 06:56:12 +08:00
Fangjun Kuang
a66e74b92f
Fix links in the doc (#619) 2022-10-14 12:23:47 +08:00
Fangjun Kuang
11bff57586
Add doc about model export (#618)
* Add doc about model export

* fix typos
2022-10-14 10:16:34 +08:00
Fangjun Kuang
c39cba5191
Support exporting to ONNX for the wenetspeech recipe (#615)
* Support exporting to ONNX for the wenetspeech recipe
2022-10-13 15:17:20 +08:00
Zengwei Yao
aa58c2ee02
Modify ActivationBalancer for speed (#612)
* add a probability to apply ActivationBalancer

* minor fix

* minor fix
2022-10-13 15:14:28 +08:00