146 Commits

Author SHA1 Message Date
Wei Kang
64aed2cdeb
Fix LG log file name (#657) 2022-11-03 23:12:35 +08:00
Wei Kang
163d929601
Add fast_beam_search_LG (#622)
* Add fast_beam_search_LG

* add fast_beam_search_LG to commonly used recipes

* fix ci

* fix ci

* Fix error
2022-11-03 16:29:30 +08:00
Wei Kang
d389524d45
remove tail padding for non-streaming models (#625) 2022-11-01 11:09:56 +08:00
Zengwei Yao
03668771d7
Get timestamps during decoding (#598)
* print out timestamps during decoding

* add word-level alignments

* support to compute mean symbol delay with word-level alignments

* print variance of symbol delay

* update doc

* support to compute delay for pruned_transducer_stateless4

* fix bug

* add doc
2022-11-01 10:24:00 +08:00
ezerhouni
9b671e1c21
Add Shallow fusion in modified_beam_search (#630)
* Add utility for shallow fusion

* test batch size == 1 without shallow fusion

* Use shallow fusion for modified-beam-search

* Modified beam search with ngram rescoring

* Fix code according to review

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2022-10-21 16:44:56 +08:00
Fangjun Kuang
d1f16a04bd
fix type hints for decode.py (#623) 2022-10-18 06:56:12 +08:00
Zengwei Yao
aa58c2ee02
Modify ActivationBalancer for speed (#612)
* add a probability to apply ActivationBalancer

* minor fix

* minor fix
2022-10-13 15:14:28 +08:00
Fangjun Kuang
1c07d2fb37
Remove all-in-one for onnx export (#614)
* Remove all-in-one for onnx export

* Exit on error for CI
2022-10-12 10:34:06 +08:00
Zengwei Yao
f3ad32777a
Gradient filter for training lstm model (#564)
* init files

* add gradient filter module

* refactor getting median value

* add cutoff for grad filter

* delete comments

* apply gradient filter in LSTM module, to filter both input and params

* fix typing and refactor

* filter with soft mask

* rename lstm_transducer_stateless2 to lstm_transducer_stateless3

* fix typos, and update RESULTS.md

* minor fix

* fix return typing

* fix typo
2022-09-29 11:15:43 +08:00
LIyong.Guo
923b60a7c6
padding zeros (#591) 2022-09-28 21:20:33 +08:00
marcoyang1998
1e31fbcd7d
Add clamping operation in Eve optimizer for all scalar weights to avoid (#550)
unstable training in some scenarios. The clamping range is set to (-10,2).
Note that this change may cause unexpected effects if you resume
training from a model that was trained without clamping.
2022-08-25 12:12:50 +08:00
Zengwei Yao
f2f5baf687
Use ScaledLSTM as streaming encoder (#479)
* add ScaledLSTM

* add RNNEncoderLayer and RNNEncoder classes in lstm.py

* add RNN and Conv2dSubsampling classes in lstm.py

* hardcode bidirectional=False

* link from pruned_transducer_stateless2

* link scaling.py pruned_transducer_stateless2

* copy from pruned_transducer_stateless2

* modify decode.py pretrained.py test_model.py train.py

* copy streaming decoding files from pruned_transducer_stateless2

* modify streaming decoding files

* simplified code in ScaledLSTM

* flat weights after scaling

* pruned2 -> pruned4

* link __init__.py

* fix style

* remove add_model_arguments

* modify .flake8

* fix style

* fix scale value in scaling.py

* add random combiner for training deeper model

* add using proj_size

* add scaling converter for ScaledLSTM

* support jit trace

* add using averaged model in export.py

* modify test_model.py, test if the model can be successfully exported by jit.trace

* modify pretrained.py

* support streaming decoding

* fix model.py

* Add cut_id to recognition results

* Add cut_id to recognition results

* do not pad in Conv subsampling module; add tail padding during decoding.

* update RESULTS.md

* minor fix

* fix doc

* update README.md

* minor change, filter infinite loss

* remove the condition of raise error

* modify type hint for the return value in model.py

* minor change

* modify RESULTS.md

Co-authored-by: pkufool <wkang.pku@gmail.com>
2022-08-19 14:38:45 +08:00
marcoyang1998
c74cec59e9
propagate changes from #525 to other librispeech recipes (#531)
* propagate changes from #525 to other librispeech recipes

* refactor display_and_save_batch to utils

* fixed typo

* reformat code style
2022-08-17 17:18:15 +08:00
Fangjun Kuang
669401869d
Filter non-finite losses (#525)
* Filter non-finite losses

* Fixes after review
2022-08-17 12:22:43 +08:00
Wei Kang
5c17255eec
Sort results to make it more convenient to compare decoding results (#522)
* Sort result to make it more convenient to compare decoding results

* Add cut_id to recognition results

* add cut_id to results for all recipes

* Fix torch.jit.script

* Fix comments

* Minor fixes

* Fix torch.jit.tracing for PyTorch version before v1.9.0
2022-08-12 07:12:50 +08:00
Fangjun Kuang
1f7832b93c
Fix loading sampler state dict. (#421)
* Fix loading sampler state dict.

* skip scan_pessimistic_batches_for_oom if params.start_batch > 0
2022-08-06 10:00:08 +08:00
Fangjun Kuang
6af5a82d8f
Convert ScaledEmbedding to nn.Embedding for inference. (#517)
* Convert ScaledEmbedding to nn.Embedding for inference.

* Fix CI style issues.
2022-08-03 15:34:55 +08:00
Fangjun Kuang
58a96e5b68
Support exporting to ONNX format (#501)
* WIP: Support exporting to ONNX format

* Minor fixes.

* Combine encoder/decoder/joiner into a single file.

* Revert merging three onnx models into a single one.

It's quite time consuming to extract a sub-graph from the combined
model. For instance, it takes more than one hour to extract
the encoder model.

* Update CI to test ONNX models.

* Decode with exported models.

* Fix typos.

* Add more doc.

* Remove ncnn as it is not fully tested yet.

* Fix as_strided for streaming conformer.
2022-08-03 10:30:28 +08:00
Wei Kang
b1d0956855
Add modified_beam_search for streaming decode (#489)
* Add modified_beam_search for pruned_transducer_stateless/streaming_decode.py

* refactor

* modified beam search for stateless3,4

* Fix comments

* Add real streaming ci
2022-07-25 16:53:23 +08:00
Zengwei Yao
8203d10be7
Add stats about duration and padding proportion (#485)
* add stats about duration and padding proportion

* add  for utt_duration

* add stats for other recipes

* add stats for other 2 recipes

* modify doc

* minor change
2022-07-25 16:40:43 +08:00
Quandwang
116d0cf26d
CTC attention model with reworked Conformer encoder and reworked Transformer decoder (#462)
* ctc attention model with reworked conformer encoder and reworked transformer decoder

* remove unnecessary func

* resolve flake8 conflicts

* fix typos and modify the expr of ScaledEmbedding

* use original beam size

* minor changes to the scripts

* add rnn lm decoding

* minor changes

* check whether q k v weight is None

* check whether q k v weight is None

* check whether q k v weight is None

* style correction

* update results

* update results

* upload the decoding results of rnn-lm to the RESULTS

* upload the decoding results of rnn-lm to the RESULTS

* Update egs/librispeech/ASR/RESULTS.md

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

* Update egs/librispeech/ASR/RESULTS.md

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

* Update egs/librispeech/ASR/RESULTS.md

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2022-07-22 15:31:25 +08:00
ezerhouni
608473b4eb
Add RNN-LM rescoring in fast beam search (#475) 2022-07-18 16:52:17 +08:00
ezerhouni
ffca1ae7fb
[WIP] Rnn-T LM nbest rescoring (#471) 2022-07-15 10:32:54 +08:00
Wei Kang
6e609c67a2
Using streaming conformer as transducer encoder (#380)
* support streaming in conformer

* Add more documents

* support streaming on pruned_transducer_stateless2; add delay penalty; fixes for decode states

* Minor fixes

* streaming for pruned_transducer_stateless4

* Fix conv cache error, support async streaming decoding

* Fix style

* Fix style

* Fix style

* Add torch.jit.export

* mask the initial cache

* Cutting off invalid frames of encoder_embed output

* fix relative positional encoding in streaming decoding to save computation

* Minor fixes

* Minor fixes

* Minor fixes

* Minor fixes

* Minor fixes

* Fix jit export for torch 1.6

* Minor fixes for streaming decoding

* Minor fixes on decode stream

* move model parameters to train.py

* make states in forward streaming optional

* update pretrain to support streaming model

* update results.md

* update tensorboard and pre-models

* fix typo

* Fix tests

* remove unused arguments

* add streaming decoding ci

* Minor fix

* Minor fix

* disable right context by default
2022-06-28 00:18:54 +08:00
Jun Wang
d792bdc9bc
fix typo (#445) 2022-06-25 11:00:53 +08:00
Fangjun Kuang
dc89b61b80
Add fast_beam_search_nbest. (#420)
* Add fast_beam_search_nbest.

* Fix CI errors.

* Fix CI errors.

* More fixes.

* Small fixes.

* Support using log_add in LG decoding with fast_beam_search.

* Support LG decoding in pruned_transducer_stateless

* Support LG for pruned_transducer_stateless2.

* Support LG for fast beam search.

* Minor fixes.
2022-06-22 00:09:25 +08:00
Fangjun Kuang
7100c33820
Add pruned RNN-T for aishell. (#436)
* Add pruned RNN-T for aishell.

* support torch script.

* Update CI.

* Minor fixes.

* Add links to sherpa.
2022-06-21 21:17:22 +08:00
Zengwei Yao
a42d96dfe0
Fix warmup (#435)
* fix warmup when scan_pessimistic_batches_for_oom

* delete comments
2022-06-20 13:40:01 +08:00
Fangjun Kuang
ab788980c9
Fix an error introduced by supporting torchscript for torch 1.6.0 (#434) 2022-06-18 08:57:20 +08:00
Fangjun Kuang
d53f69108f
Support torch 1.6.0 (#433) 2022-06-17 22:24:47 +08:00
Zengwei Yao
53f38c01d2
Emformer with conv module and scaling mechanism (#389)
* copy files from existing branch

* add rule in .flake8

* minor style fix

* fix typos

* add tail padding

* refactor, use fixed-length cache for batch decoding

* copy from streaming branch

* copy from streaming branch

* modify emformer states stack and unstack, streaming decoding, to be continued

* refactor Stream class

* rename streaming_feature_extractor.py

* refactor streaming decoding

* test states stack and unstack

* fix bugs, no grad, and num_proccessed_frames

* add modified_beam_search, fast_beam_search

* support torch.jit.export

* use torch.div

* copy from pruned_transducer_stateless4

* modify export.py

* add author info

* delete other test functions

* minor fix

* modify doc

* fix style

* minor fix doc

* minor fix

* minor fix doc

* update RESULTS.md

* fix typo

* add info

* fix typo

* fix doc

* add test function for conv module, and minor fix.

* add copyright info

* minor change of test_emformer.py

* fix doc of stack and unstack, test case with batch_size=1

* update README.md
2022-06-13 15:09:17 +08:00
Quandwang
8512aaf585
fix typos (#409) 2022-06-08 20:08:44 +08:00
Fangjun Kuang
2f1e23cde1
Narrower and deeper conformer (#330)
* Copy files for editing.

* Add random combine from #229.

* Minor fixes.

* Pass model parameters from the command line.

* Fix warnings.

* Fix warnings.

* Update readme.

* Rename to avoid conflicts.

* Update results.

* Add CI for pruned_transducer_stateless5

* Typo fixes.

* Remove random combiner.

* Update decode.py and train.py to use periodically averaged models.

* Minor fixes.

* Revert to use random combiner.

* Update results.

* Minor fixes.
2022-05-23 14:39:11 +08:00
Daniel Povey
4e23fb2252
Improve diagnostics code memory-wise and accumulate more stats. (#373)
* Update diagnostics, hopefully print more stats.

# Conflicts:
#	egs/librispeech/ASR/pruned_transducer_stateless4b/train.py

* Remove memory-limit options arg

* Remove unnecessary option for diagnostics code, collect on more batches
2022-05-19 11:45:59 +08:00
Fangjun Kuang
f6ce135608
Various fixes to support torch script. (#371)
* Various fixes to support torch script.

* Add tests to ensure that the model is torch scriptable.

* Update tests.
2022-05-16 21:46:59 +08:00
Fangjun Kuang
f23dd43719
Update results for libri+giga multi dataset setup. (#363)
* Update results for libri+giga multi dataset setup.
2022-05-14 21:45:39 +08:00
Fangjun Kuang
7b7acdf369
Support --iter in export.py (#360) 2022-05-13 10:51:44 +08:00
Fangjun Kuang
aeb8986e35
Ignore padding frames during RNN-T decoding. (#358)
* Ignore padding frames during RNN-T decoding.

* Fix outdated decoding code.

* Minor fixes.
2022-05-13 07:39:14 +08:00
Zengwei Yao
c059ef3169
Keep model_avg on cpu (#348)
* keep model_avg on cpu

* explicitly convert model_avg to cpu

* minor fix

* remove device conversion for model_avg

* modify usage of the model device in train.py

* change model.device to next(model.parameters()).device for decoding

* assert params.start_epoch>0

* assert params.start_epoch>0, params.start_epoch
2022-05-07 10:42:34 +08:00
Fangjun Kuang
32f05c00e3
Save batch to disk on exception. (#350) 2022-05-06 17:49:40 +08:00
Fangjun Kuang
e1c3e98980
Save batch to disk on OOM. (#343)
* Save batch to disk on OOM.

* minor fixes

* Fixes after review.

* Fix style issues.
2022-05-05 15:09:23 +08:00
Fangjun Kuang
ac84220de9
Modified conformer with multi datasets (#312)
* Copy files for editing.

* Use librispeech + gigaspeech with modified conformer.

* Support specifying number of workers for on-the-fly feature extraction.

* Feature extraction code for GigaSpeech.

* Combine XL splits lazily during training.

* Fix warnings in decoding.

* Add decoding code for GigaSpeech.

* Fix decoding the gigaspeech dataset.

We have to use the decoder/joiner networks for the GigaSpeech dataset.

* Disable speed perturbation for XL subset.

* Compute the Nbest oracle WER for RNN-T decoding.

* Minor fixes.

* Minor fixes.

* Add results.

* Update results.

* Update CI.

* Update results.

* Fix style issues.

* Update results.

* Fix style issues.
2022-04-29 15:40:30 +08:00
Fangjun Kuang
caab6cfd92
Support specifying iteration number of checkpoints for decoding. (#336)
See also #289
2022-04-28 14:09:22 +08:00
pehonnet
9a98e6ced6
fix fp16 option in example usage (#332) 2022-04-25 18:51:53 +08:00
Guo Liyong
78418ac37c fix comments 2022-04-13 13:09:24 +08:00
Daniel Povey
2a854f5607
Merge pull request #309 from danpovey/update_results
Update results; will further update this before merge
2022-04-12 12:22:48 +08:00
Mingshuang Luo
93c60a9d30
Code style check for librispeech pruned transducer stateless2 (#308) 2022-04-11 22:15:18 +08:00
Daniel Povey
ead822477c Fix rebase 2022-04-11 21:01:13 +08:00
Daniel Povey
e8eb0b94d9 Updating RESULTS.md; fix in beam_search.py 2022-04-11 21:00:11 +08:00
pkufool
a92133ef96 Minor fixes 2022-04-11 20:58:47 +08:00