12 Commits

Author SHA1 Message Date
zr_jin
a81396b482
Use tokens.txt to replace bpe.model (#1162)
2023-08-12 16:53:59 +08:00
Fangjun Kuang
f5de2e90c6
Fix style issues. (#937)
2023-03-08 22:56:04 +08:00
pehonnet
07243d136a
remove key from result filename (#936)
Co-authored-by: pe-honnet <pe.honnet@telepathy.ai>
2023-03-08 21:06:07 +08:00
Zengwei Yao
4e832fa6b0
fix reduction conformer_ctc3/train.py (#908)
2023-02-14 20:45:38 +08:00
Zengwei Yao
25ee50e27c
add ctc-greedy-search with timestamps (#905)
2023-02-13 19:45:09 +08:00
Zengwei Yao
af735eb75b
Get alignments using lhotse workflows align-with-torchaudio (#888)
* add lhotse workflow align-with-torchaudio

* modify related decode.py files
2023-02-08 21:54:35 +08:00
Zengwei Yao
d12e6f098c
Get (start, end) timestamps for CTC models (#876)
* parse timestamps and texts for BPE-based models

* parse timestamps (frame indexes) and texts for other cases

* add test functions

* add parse_fsa_timestamps_and_texts function, test in conformer_ctc3/decode.py

* calculate symbol delay for (start, end) timestamps
2023-02-07 21:43:16 +08:00
Zengwei Yao
5a05b95730
add params.hlg_scale (#880)
2023-02-06 23:21:46 +08:00
Zengwei Yao
b25c234c51
Add Zipformer-MMI (#746)
* Minor fix to conformer-mmi

* Minor fixes

* Fix decode.py

* add training files

* train with ctc warmup

* add pruned_transducer_stateless7_mmi

* add zipformer_mmi/mmi_decode.py, using HP as decoding graph

* add mmi_decode.py

* remove pruned_transducer_stateless7_mmi

* rename zipformer_mmi/train_with_ctc.py as zipformer_mmi/train.py

* remove unused method

* rename mmi_decode.py

* add export.py pretrained.py jit_pretrained.py ...

* add RESULTS.md

* add CI test

* add docs

* add README.md

Co-authored-by: pkufool <wkang.pku@gmail.com>
2022-12-11 21:30:39 +08:00
Wei Kang
c25c8c6ad1
Add need_repeat_flag in phone based ctc graph compiler (#727)
* Fix is_repeat_token in icefall

* Fix phone based recipe

* Update egs/librispeech/ASR/conformer_ctc3/train.py

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

* Fix black

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2022-12-04 17:20:17 +08:00
Zengwei Yao
8eb4b9d96d
Combining rnnt loss and k2-ctc loss for Dan's Zipformer (#683)
* init files

* add ctc as auxiliary loss and ctc_decode.py

* tuning the scalar of HLG score for 1best, nbest and nbest-oracle

* rename to pruned_transducer_stateless7_ctc

* fix doc

* fix bug, recover the hlg scores

* modify ctc_decode.py, move out the hlg scale

* fix hlg_scale

* add export.py and pretrained.py, and so on

* upload files, update README.md and RESULTS.md

* add CI test
2022-12-03 19:01:10 +08:00
Zengwei Yao
ece728d895
Apply delay penalty on k2 ctc loss (#669)
* add init files

* fix bug, apply delay penalty

* fix decoding code and getting timestamps

* add option applying delay penalty on ctc log-prob

* fix bug of streaming decoding

* minor change for bpe-based case

* add test_model.py

* add README.md

* add CI
2022-11-28 22:34:02 +08:00