3 Commits

Author SHA1 Message Date
marcoyang1998
57d6482a79
Streaming Zipformer with multi-dataset (#984)
* modify train.py

* add right padding option in decode.py

* update RESULTS.md
2023-04-21 15:43:28 +08:00
Yifan Yang
070c77e724
Add Blankskip to Zipformer+CTC (#730)
* init files

* add ctc as auxiliary loss and ctc_decode.py

* tuning the scalar of HLG score for 1best, nbest and nbest-oracle

* rename to pruned_transducer_stateless7_ctc

* fix doc

* fix bug, recover the hlg scores

* modify ctc_decode.py, move out the hlg scale

* fix hlg_scale

* add export.py and pretrained.py, and so on

* upload files, update README.md and RESULTS.md

* add CI test

* update .gitignore

* create symlinks

* Add Blank Skip to Zipformer+CTC

* Add warmup to blank skip

* Add warmup to blank skip

* Add __init__.py

* Add parameters_names to Adam

* Add warmup to blank skip

* Modify frame_reducer

* Modify frame_reducer

* Add Blank Skip to decode.

* Add ctc_decode.py

* Add blank skip to Zipformer+CTC

* process conflict

* process conflict

* modify ctc_guild_decode_bk.py

* modify Lconv

* produce the conflict

* Add export.py

* finish export

* fix for running black

* Add ci test

* Add ci-test

* chmod

* chmod

* fix bug for ci-test

* fix bug for ci-test

* fix bug for ci-test

* rename the dirname

* rename the dirname

* change dirname

* change dirname

* fix notes

* add pretrained.py

* add pretrained.py

* add pretrained.py

* add pretrained.py

* add pretrained.py

* add pretrained.py

* fix

* fix

* fix

* finished

* add the Copyright info and notes

Co-authored-by: Zengwei Yao <yaozengwei@outlook.com>
Co-authored-by: yifanyang <yifanyeung@yifanyangs-MacBook-Pro.local>
2022-12-21 17:41:31 +08:00
Fangjun Kuang
ac84220de9
Modified conformer with multi datasets (#312)
* Copy files for editing.

* Use librispeech + gigaspeech with modified conformer.

* Support specifying number of workers for on-the-fly feature extraction.

* Feature extraction code for GigaSpeech.

* Combine XL splits lazily during training.

* Fix warnings in decoding.

* Add decoding code for GigaSpeech.

* Fix decoding the gigaspeech dataset.

We have to use the decoder/joiner networks for the GigaSpeech dataset.

* Disable speed perturbe for XL subset.

* Compute the Nbest oracle WER for RNN-T decoding.

* Minor fixes.

* Minor fixes.

* Add results.

* Update results.

* Update CI.

* Update results.

* Fix style issues.

* Update results.

* Fix style issues.
2022-04-29 15:40:30 +08:00