96 Commits

Author SHA1 Message Date
jinzr
b504ac314f minor fixes 2023-10-24 11:35:53 +08:00
jinzr
6eb141f0c5 minor updates 2023-10-24 11:01:44 +08:00
zr_jin
401e37ffa0
Merge branch 'k2-fsa:master' into dev_zipformer_cn 2023-10-24 01:54:50 +08:00
zr_jin
92ef561ff7
Minor fixes for torch.jit.script support (#1329) 2023-10-24 01:10:50 +08:00
zr_jin
3eb55080aa
Merge branch 'k2-fsa:master' into dev_zipformer_cn 2023-10-23 09:11:14 +08:00
zr_jin
d2bd0933b1
Compatibility with the latest Lhotse (#1314) 2023-10-17 21:22:32 +08:00
zr_jin
1ef349d120
[WIP] AISHELL-1 pruned transducer stateless7 streaming recipe (#1300)
* `pruned_transducer_stateless7_streaming` for AISHELL-1

* Update train.py

* Update train2.py

* Update decode.py

* Update RESULTS.md
2023-10-16 16:28:16 +08:00
zr_jin
162ceaf4b3
fixes for data preparation (#1307)
Issue: #1306
2023-10-12 17:05:41 +08:00
zr_jin
0d09a44930
Update train.py (#1299) 2023-10-11 10:06:00 +08:00
Fangjun Kuang
f14b673408
Add HLG decoding with OpenFst on CPU for aishell conformer_ctc (#1279) 2023-10-01 13:46:16 +08:00
yaguang
8181d19860
check bbpe model exists in advance. (#1277) 2023-09-27 17:35:26 +08:00
yaguang
a5ba1133c4
Compatible with new lhotse versions. (#1278) 2023-09-27 17:33:38 +08:00
zr_jin
ef658d691e
fixes for init value of diagnostics.TensorDiagnosticOptions (#1269)
* fixes for `diagnostics`

Replace `2 ** 22` with `512` as the default value of `diagnostics.TensorDiagnosticOptions`

also black formatted some scripts

* fixed formatting issues
2023-09-24 17:06:47 +08:00
zr_jin
023f6e05d4
Merge branch 'k2-fsa:master' into dev_zipformer_cn 2023-09-22 19:18:31 +08:00
Fangjun Kuang
34e40a86b3
Fix exporting decoder model to onnx (#1264)
* Use torch.jit.script() to export the decoder model

See also https://github.com/k2-fsa/sherpa-onnx/issues/327
2023-09-22 09:57:15 +08:00
Fangjun Kuang
f5dc957d44
Fix CI tests (#1266) 2023-09-21 21:16:14 +08:00
zr_jin
7cc2dae940
Fixes to incorporate with the latest Lhotse release (#1249) 2023-09-13 12:39:49 +08:00
zr_jin
d3efac6618 doc str updated 2023-08-30 11:02:55 +08:00
zr_jin
c65f80a11f this commit should finalize the PR (hopefully) 2023-08-30 10:50:34 +08:00
zr_jin
9bc287ca03 Update RESULTS.md 2023-08-17 16:52:47 +08:00
zr_jin
ce380a5fb3 minor updates 2023-08-17 09:42:03 +08:00
zr_jin
c5bed3e4de Merge branch 'dev_zipformer_cn' of https://github.com/JinZr/icefall into dev_zipformer_cn 2023-08-16 11:50:39 +08:00
zr_jin
3ba89391a7 Update RESULTS.md 2023-08-16 11:50:36 +08:00
JinZr
241718964f minor updates 2023-08-16 10:35:27 +08:00
jinzr
4200126f9b fixed several formatting issues 2023-08-14 13:54:19 +08:00
jinzr
658ec630d3 updated .md files for aishell and aishell4 recipes 2023-08-14 01:25:00 +08:00
JinZr
53b7aead10 Merge branch 'dev_zipformer_cn' of https://github.com/JinZr/icefall into dev_zipformer_cn 2023-08-14 01:10:03 +08:00
JinZr
6a41afe589 minor fixes 2023-08-14 01:05:09 +08:00
jinzr
7bd54936db Update RESULTS.md 2023-08-14 00:54:39 +08:00
zr_jin
4a7e2d708d
Merge branch 'k2-fsa:master' into dev_zipformer_cn 2023-08-13 01:20:02 +08:00
jinzr
f36e6e08d0 Aishell Zipformer Recipe 2023-08-13 00:51:36 +08:00
zr_jin
a81396b482
Use tokens.txt to replace bpe.model (#1162) 2023-08-12 16:53:59 +08:00
zr_jin
74806b744b
disable speed perturbation by default (#1176)
* disable speed perturbation by default

* minor fixes

* minor updates

* updated bash scripts to incorporate the `speed-perturb` arg

* minor fixes

1. changed the naming scheme from `speed-perturb` to `perturb-speed` to align with the librispeech recipe

>> 00256a7669/egs/librispeech/ASR/local/compute_fbank_librispeech.py (L65)

2. changed arg type for `perturb-speed` to str2bool
2023-08-10 20:56:02 +08:00
zr_jin
856c0f2a60
fixed default param for an aishell recipe (#1159) 2023-07-04 19:12:39 +08:00
Wei Kang
219bba1310
zipformer wenetspeech (#1130)
* copy files

* update train.py

* small fixes

* Add decode.py

* Fix dataloader in decode.py

* add blank penalty

* Add blank-penalty to other decoding method

* Minor fixes

* add zipformer2 recipe

* Minor fixes

* Remove pruned7

* export and test models

* Replace bpe with tokens in export.py and pretrain.py

* Minor fixes

* Minor fixes

* Minor fixes

* Fix export

* Update results

* Fix zipformer-ctc

* Fix ci

* Fix ci

* Fix CI

* Fix CI

---------

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-06-26 09:33:18 +08:00
Wei Kang
ba257efbcd
Add Context biasing (#1038)
* Add context biasing for librispeech

* Add context biasing for wenetspeech

* fix bugs

* Implement Aho-Corasick context graph

* fix some bugs

* Fixes to forward_one_step; add draw to context graph

* add output arc; fix black

* Fix wenetspeech tokenizer

* Minor fixes to the decode.py
2023-06-03 21:28:49 +08:00
Fangjun Kuang
7b0afbdc16
Remove cur_batch_idx (#1102) 2023-05-30 14:49:54 +08:00
marcoyang1998
585e7b224f
Aishell pruned_transducer_stateless7 (#962)
* Add pruned_transducer_stateless7 for Aishell

* update README.md

* update comments and small fixes
2023-05-23 11:04:33 +08:00
Wei Kang
80156dda09
Training with byte level BPE (AIShell) (#986)
* copy files from zipformer librispeech

* Add byte bpe training for aishell

* compile LG graph

* Support LG decoding

* Minor fixes

* black

* Minor fixes

* export & fix pretrain.py

* fix black

* Update RESULTS.md

* Fix export.py
2023-05-04 19:16:17 +08:00
Wei Kang
0efed1cec5
Fix path in aishell rnnlm training (#1016) 2023-04-20 23:09:31 +08:00
Wei Kang
5c65516e05
Fix aishell rnnlm training command (#1015) 2023-04-20 16:14:16 +08:00
marcoyang1998
d337398d29
Shallow fusion for Aishell (#954)
* add shallow fusion and LODR for aishell

* update RESULTS

* add save by iterations
2023-04-03 16:20:29 +08:00
Fangjun Kuang
35e21a0d2e
Fix torchscript export for aishell (#969) 2023-03-27 14:08:26 +08:00
Jason's Lab
6196b4a407
Add char-based language model training process for aishell. (#945)
* Add char-based language model training process for aishell.

Add soft link from librispeech/ASR/local/sort_lm_training_data.py to aishell/ASR/local/

---------

Co-authored-by: lichao <www.563042811@qq.com>
2023-03-16 09:52:11 +08:00
Fangjun Kuang
f5de2e90c6
Fix style issues. (#937) 2023-03-08 22:56:04 +08:00
pehonnet
07243d136a
remove key from result filename (#936)
Co-authored-by: pe-honnet <pe.honnet@telepathy.ai>
2023-03-08 21:06:07 +08:00
Meng Wei
74a2069f94
fix expired links (#856) 2023-01-28 14:43:47 +08:00
marcoyang
53454701cb fix segmentation fault 2022-11-22 11:39:21 +08:00
Desh Raj
d31db01037 manual correction of black formatting 2022-11-17 14:18:05 -05:00
Desh Raj
107df3b115 apply black on all files 2022-11-17 09:42:17 -05:00