* add stats about duration and padding proportion
* add stats for utt_duration
* add stats for other recipes
* add stats for other 2 recipes
* modify doc
* minor change
* support streaming in conformer
* Add more documents
* support streaming on pruned_transducer_stateless2; add delay penalty; fixes for decode states
* Minor fixes
* streaming for pruned_transducer_stateless4
* Fix conv cache error, support async streaming decoding
* Fix style
* Fix style
* Fix style
* Add torch.jit.export
* mask the initial cache
* Cut off invalid frames of encoder_embed output
* fix relative positional encoding in streaming decoding to save computation
* Minor fixes
* Minor fixes
* Minor fixes
* Minor fixes
* Minor fixes
* Fix jit export for torch 1.6
* Minor fixes for streaming decoding
* Minor fixes on decode stream
* move model parameters to train.py
* make states in forward streaming optional
* update pretrain to support streaming model
* update results.md
* update tensorboard and pre-trained models
* fix typo
* Fix tests
* remove unused arguments
* add streaming decoding ci
* Minor fix
* Minor fix
* disable right context by default
* keep model_avg on cpu
* explicitly convert model_avg to cpu
* minor fix
* remove device conversion for model_avg
* modify usage of the model device in train.py
* change model.device to next(model.parameters()).device for decoding
* assert params.start_epoch>0
* assert params.start_epoch>0, params.start_epoch
* First upload of model averaging code.
* minor fix
* update decode file
* update .flake8
* rename pruned_transducer_stateless3 to pruned_transducer_stateless4
* change epoch counter to start from 1 instead of 0
* minor fix of pruned_transducer_stateless4/train.py
* refactor the checkpoint.py
* minor fix, update docs, and modify the epoch number to count from 1 in the pruned_transducer_stateless4/decode.py
* update author info
* add docs for the scaling in function average_checkpoints_with_averaged_model