kobenaxie
9a9c5a0f9b
remove unused code. ( #821 )
2023-01-06 11:16:22 +08:00
Yifan Yang
b9626f2e06
fix typo for ctc-decode.py ( #815 )
...
Co-authored-by: yifanyang <yifanyeung@yifanyangs-MacBook-Pro.local>
2023-01-05 17:18:43 +08:00
Fangjun Kuang
8642dbc0bd
Fix setup_dist ( #806 )
2023-01-04 12:21:19 +08:00
Yunusemre
0f26edfde9
Add Zipformer Onnx Support ( #778 )
...
* add export script
* add zipformer onnx pretrained script
* add onnx zipformer test
* fix style
* add zipformer onnx to workflow
* replace is_in_onnx_export with is_tracing
* add github.event.label.name == 'onnx'
* add is_tracing to necessary conditions
* fix pooling_mask
* add onnx_check
* add onnx_check to scripts
* add is_tracing to scaling.py
2023-01-03 16:59:44 +08:00
marcoyang1998
80cce141b4
Full libri fix manifest ( #804 )
...
* modify the name of the directory of vq manifest
* fix missing manifest in full libri training
2023-01-03 15:40:53 +08:00
Daniil
2fd970b682
not removing result_dir in tedlium conformer ctc2 + add lm stem to compile_hlg_using_openfst.py + add MASTER_ADDR to be provided to setup_dist ( #801 )
2023-01-02 08:08:32 +08:00
Zengwei Yao
67ae5fdf2b
Doc streaming zipformer ( #798 )
...
* add doc for streaming_zipformer
* update README.md
2022-12-30 15:21:18 +08:00
behnamasefi
a54b748a02
check for utterance len ( #795 )
...
Co-authored-by: behnam <basefisaray@roku.com>
2022-12-30 11:06:09 +08:00
Zengwei Yao
d167aad4ab
Add streaming zipformer ( #787 )
...
* add streaming zipformer code
* add test_model.py
* add export.py, pretrained.py, jit_pretrained.py
* add cached_len for pooling module
* add jit_trace_export.py and jit_trace_pretrained.py
* fix bug in jit.trace
* update RESULTS.md
* add CI test
* minor fix in pruned_transducer_stateless7/zipformer.py
* update README.md
2022-12-30 10:52:18 +08:00
marcoyang1998
aa0fe4e4ac
Fix typos in RESULTS.md ( #797 )
2022-12-29 11:54:42 +08:00
marcoyang1998
1f0408b103
Support Transformer LM ( #750 )
...
* support transformer LM
* show number of parameters during training
* update docstring
* testing files for ppl calculation
* add lm wrapper for rnn and transformer LM
* apply lm wrapper in lm shallow fusion
* small updates
* update decode.py to support LM fusion and LODR
* add export.py
* update CI and workflow
* update decoding results
* fix CI
* remove transformer LM from CI test
2022-12-29 10:53:36 +08:00
Yuekai Zhang
3c54333b06
fix bug ( #796 )
2022-12-28 11:20:38 +08:00
marcoyang1998
05dfd5e630
Fix distillation with HuBERT ( #790 )
...
* update vq huggingface url
* remove hard lhotse version requirement
* resolve ID mismatch
* small fixes
* Update egs/librispeech/ASR/pruned_transducer_stateless6/vq_utils.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* update version check
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2022-12-27 15:26:11 +08:00
Yifan Yang
a24a1cbfa9
small fix for zipformer_ctc_blankskip.rst ( #792 )
...
Co-authored-by: yifanyang <yifanyeung@yifanyangs-MacBook-Pro.local>
2022-12-27 15:06:53 +08:00
Fangjun Kuang
88b7895adf
fix librispeech.py in multi-dataset setup ( #791 )
2022-12-27 13:59:55 +08:00
Fangjun Kuang
dfbcf606e7
small fixes to prepare.sh ( #789 )
2022-12-27 09:25:42 +08:00
Yifan Yang
4e249da2c4
Add zipformer_ctc_blankskip.rst ( #784 )
...
* Add zipformer_ctc_blankskip.rst
* typo fix for zipformer_mmi.rst
* fix warning
* Update docs/source/recipes/Non-streaming-ASR/librispeech/zipformer_ctc_blankskip.rst
Co-authored-by: yifanyang <yifanyeung@yifanyangs-MacBook-Pro.local>
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2022-12-26 14:30:20 +08:00
Yifan Yang
59eb465b3c
optimize frame_reducer.py ( #783 )
...
Co-authored-by: yifanyang <yifanyeung@yifanyangs-MacBook-Pro.local>
2022-12-23 17:55:36 +08:00
BuaaAlban
7eb2d0edb6
Update train.py ( #773 )
...
Fix transducer lstm egs bug as mentioned in issue 579
2022-12-23 11:38:22 +08:00
Yifan Yang
070c77e724
Add Blankskip to Zipformer+CTC ( #730 )
...
* init files
* add ctc as auxiliary loss and ctc_decode.py
* tuning the scalar of HLG score for 1best, nbest and nbest-oracle
* rename to pruned_transducer_stateless7_ctc
* fix doc
* fix bug, recover the hlg scores
* modify ctc_decode.py, move out the hlg scale
* fix hlg_scale
* add export.py and pretrained.py, and so on
* upload files, update README.md and RESULTS.md
* add CI test
* update .gitignore
* create symlinks
* Add Blank Skip to Zipformer+CTC
* Add warmup to blank skip
* Add warmup to blank skip
* Add __init__.py
* Add parameters_names to Adam
* Add warmup to blank skip
* Modify frame_reducer
* Modify frame_reducer
* Add Blank Skip to decode.
* Add ctc_decode.py
* Add blank skip to Zipformer+CTC
* process conflict
* process conflict
* modify ctc_guild_decode_bk.py
* modify Lconv
* produce the conflict
* Add export.py
* finish export
* fix for running black
* Add ci test
* Add ci-test
* chmod
* chmod
* fix bug for ci-test
* fix bug for ci-test
* fix bug for ci-test
* rename the dirname
* rename the dirname
* change dirname
* change dirname
* fix notes
* add pretrained.py
* add pretrained.py
* add pretrained.py
* add pretrained.py
* add pretrained.py
* add pretrained.py
* fix
* fix
* fix
* finished
* add the Copyright info and notes
Co-authored-by: Zengwei Yao <yaozengwei@outlook.com>
Co-authored-by: yifanyang <yifanyeung@yifanyangs-MacBook-Pro.local>
2022-12-21 17:41:31 +08:00
Zengwei Yao
65d7192dca
Fix zipformer attn_output_weights ( #774 )
...
* fix attn_output_weights
* remove in-place op
2022-12-19 20:10:39 +08:00
Zengwei Yao
fbc1d3b194
fix src_key_padding_mask in DownsampledZipformerEncoder ( #768 )
2022-12-17 22:03:13 +08:00
kobenaxie
6d659f423d
delete duplicate line for encoder initial state ( #765 )
2022-12-15 20:42:07 +08:00
Wei Kang
ad475ec10d
Add documents for pruned_transducer_stateless ( #526 )
...
* begin to add documents for pruned_transducer_stateless
* Move lstm docs to Streaming folder
* Add documents for pruned transducer stateless models
* Move zipformer mmi to non-streaming recipe
* Add more docs for streaming decoding
* Fix typo
2022-12-15 19:07:28 +08:00
Fangjun Kuang
fbc8894804
Add comment for compile_hlg_using_openfst.py ( #762 )
2022-12-14 13:47:23 +08:00
Daniil
b293db4baf
Tedlium3 conformer ctc2 ( #696 )
...
* modify preparation
* small refactor
* add tedlium3 conformer_ctc2
* modify decode
* filter unk in decode
* add scaling converter
* address comments
* fix lambda function lhotse
* add implicit manifest shuffle
* refactor ctc_greedy_search
* import model arguments from train.py
* style fix
* fix ci test and last style issues
* update RESULTS
* fix RESULTS numbers
* fix label smoothing loss
* update model parameters number in RESULTS
2022-12-13 16:13:26 +08:00
Zengwei Yao
0470bbae66
minor fix for zipformer recipe ( #758 )
...
* minor fix
* add CI test
2022-12-13 15:47:30 +08:00
Zengwei Yao
b25c234c51
Add Zipformer-MMI ( #746 )
...
* Minor fix to conformer-mmi
* Minor fixes
* Fix decode.py
* add training files
* train with ctc warmup
* add pruned_transducer_stateless7_mmi
* add zipformer_mmi/mmi_decode.py, using HP as decoding graph
* add mmi_decode.py
* remove pruned_transducer_stateless7_mmi
* rename zipformer_mmi/train_with_ctc.py as zipformer_mmi/train.py
* remove unused method
* rename mmi_decode.py
* add export.py pretrained.py jit_pretrained.py ...
* add RESULTS.md
* add CI test
* add docs
* add README.md
Co-authored-by: pkufool <wkang.pku@gmail.com>
2022-12-11 21:30:39 +08:00
wzy
e83409cbe5
Filter the training data of T < S for Wenet train recipe ( #753 )
...
* filter the case of T < S for training data
* fix style issues
* fix style issues
* fix style issues
Co-authored-by: 张云斌 <zhangyunbin@MacBook-Air.local>
2022-12-11 20:16:10 +08:00
Yifan Yang
02c18ba4b2
rm the dup line of Zipformer.py ( #755 )
...
Co-authored-by: yifanyang <yifanyeung@yifanyangs-MacBook-Pro.local>
2022-12-10 19:34:19 +08:00
Desh Raj
c4aaf3ea3b
Add AliMeeting multi-condition training recipe ( #751 )
...
* add AliMeeting multi-domain recipe
* convert scripts to symbolic links
2022-12-10 18:15:23 +08:00
Yifan Yang
a0cf85343d
fix for memory usage in pruned_transducer_stateless7/scaling.py ( #752 )
...
Co-authored-by: yifanyang <yifanyeung@yifanyangs-MacBook-Pro.local>
2022-12-09 19:23:11 +08:00
Fangjun Kuang
4501821fd9
Support using OpenFst to compile HLG. ( #606 )
...
* Support using OpenFst to compile HLG.
* Fix style issues
2022-12-09 16:46:44 +08:00
armusc
d65fe17d27
Update train.py with parameters_names as required by optimizer initialization ( #742 )
...
* Update train.py
2022-12-08 20:21:51 +08:00
huangruizhe
0e325c8782
Fixed rnn_lm model.py ( #738 )
2022-12-07 15:43:26 +08:00
Ali Haznedaroğlu
10472e7ffc
Update prepare.sh ( #737 )
2022-12-07 08:22:50 +08:00
Fangjun Kuang
f13cf61b05
Convert conv-emformer to ncnn ( #717 )
...
* Export conv-emformer via torch.jit.trace()
2022-12-06 16:34:27 +08:00
Cesc
be6e08f69a
fix wenet stateless5 jit export error ( #735 )
2022-12-05 23:35:10 +08:00
Fangjun Kuang
bd7fa2253d
Update the manifest statistics of the L subset of wenetspeech ( #731 )
2022-12-04 20:27:45 +08:00
Wei Kang
c25c8c6ad1
Add need_repeat_flag in phone based ctc graph compiler ( #727 )
...
* Fix is_repeat_token in icefall
* Fix phone based recipe
* Update egs/librispeech/ASR/conformer_ctc3/train.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Fix black
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2022-12-04 17:20:17 +08:00
Senyan Li
e6a6727012
Add Tibetan Amdo dialect xbmu_amdo31 in egs ( #706 )
...
* add egs/xbmu_amdo31
* fix xbmu_amdo31/ASR/pruned_transducer_stateless5/train.py
* fix xbmu_amdo31/ASR/pruned_transducer_stateless5/asr_datamodule.py
* fix xbmu_amdo31/ASR/prepare.sh
* add RESULTS.md and README.md
* fix pruned_transducer_stateless5 decode.py
* add transducer stateless7
* fix transducer_stateless7
* fix RESULTS.md error
* Add pruned_transducer_stateless7 validation set results
2022-12-03 23:50:49 +08:00
Zengwei Yao
8eb4b9d96d
Combining rnnt loss and k2-ctc loss for Dan's Zipformer ( #683 )
...
* init files
* add ctc as auxiliary loss and ctc_decode.py
* tuning the scalar of HLG score for 1best, nbest and nbest-oracle
* rename to pruned_transducer_stateless7_ctc
* fix doc
* fix bug, recover the hlg scores
* modify ctc_decode.py, move out the hlg scale
* fix hlg_scale
* add export.py and pretrained.py, and so on
* upload files, update README.md and RESULTS.md
* add CI test
2022-12-03 19:01:10 +08:00
Weiji Zhuang
7700ddcb38
update multidataset zipformer results ( #728 )
2022-12-02 17:40:42 +08:00
Amir Hussein
6f71981667
MGB2 ( #396 )
...
* mgb2
* mgb2
* adding pruned transducer stateless to mgb2
* update display_manifest_statistics.py
* .
* stateless transducer MGB-2
* Update README.md
* Update RESULTS.md
* Update prepare_lang_bpe.py
* Update asr_datamodule.py
* .nfs removed
* Adding symlink
* .
* resolving conflicts
* Update .gitignore
* black formatting
* Update compile_hlg.py
* Update compute_fbank_musan.py
* Update convert_transcript_words_to_tokens.py
* Update download_lm.py
* Update generate_unique_lexicon.py
* adding symlinks
* fixing symbolic links
2022-12-02 10:58:34 +08:00
Fangjun Kuang
6533f359c9
Fix CI ( #726 )
...
* Fix CI
* Disable shuffle for yesno.
See https://github.com/k2-fsa/icefall/issues/197
2022-12-02 10:53:06 +08:00
Fangjun Kuang
04c9fc9c9f
Fix for older versions of k2 ( #725 )
2022-12-02 09:18:28 +08:00
Fangjun Kuang
2bca7032af
Update RNNLM training scripts ( #720 )
...
* Update RNNLM training scripts
* Fix a typo
* Fix CI
2022-12-01 15:57:43 +08:00
Fangjun Kuang
556c63fbb7
Describe how to fix segfault in doc ( #719 )
2022-12-01 08:58:18 +08:00
marcoyang1998
4b5bc480e8
Add low-order density ratio in RNNLM shallow fusion ( #678 )
...
* Support LODR in RNNLM shallow fusion
* fix style
* fix code style
* update workflow and CI
* update results
* propagate changes to stateless3
* add decoding results for stateless3+giga
* fix CI
2022-11-30 17:26:05 +08:00
Daniel Povey
1d5c03f85a
Merge pull request #705 from glynpu/improve_diagnostic
...
[ready]show dominant parameters
2022-11-29 20:00:52 +08:00