icefall

mirror of https://github.com/k2-fsa/icefall.git synced 2025-12-11 06:55:27 +00:00

Author	SHA1	Message	Date
Wei Kang	11d816d174	Add customized score for hotwords (#1385 ) * add custom score for each hotword * Add more comments * Fix decode * fix style * minor fixes	2023-11-18 18:47:55 +08:00
Fangjun Kuang	666d69b20d	Rename train2.py to avoid confusion (#1386 )	2023-11-17 18:12:59 +08:00
zr_jin	23913f6afd	Minor refinements for some stale but recently merged PRs (#1354 ) * incorporate https://github.com/k2-fsa/icefall/pull/1269 * incorporate https://github.com/k2-fsa/icefall/pull/1301 * black formatted * incorporate https://github.com/k2-fsa/icefall/pull/1162 * black formatted	2023-10-31 10:28:20 +08:00
zr_jin	1814bbb0e7	typo fixed (#1334 )	2023-10-25 00:03:33 +08:00
zr_jin	d76c3fe472	Migrate zipformer model to other Chinese datasets (#1216 ) added zipformer recipe for AISHELL-1	2023-10-24 16:24:46 +08:00
zr_jin	92ef561ff7	Minor fixes for torch.jit.script support (#1329 )	2023-10-24 01:10:50 +08:00
zr_jin	d2bd0933b1	Compatibility with the latest Lhotse (#1314 )	2023-10-17 21:22:32 +08:00
zr_jin	1ef349d120	[WIP] AISHELL-1 pruned transducer stateless7 streaming recipe (#1300 ) * `pruned_transducer_stateless7_streaming` for AISHELL-1 * Update train.py * Update train2.py * Update decode.py * Update RESULTS.md	2023-10-16 16:28:16 +08:00
zr_jin	162ceaf4b3	fixes for data preparation (#1307 ) Issue: #1306	2023-10-12 17:05:41 +08:00
zr_jin	0d09a44930	Update train.py (#1299 )	2023-10-11 10:06:00 +08:00
Fangjun Kuang	f14b673408	Add HLG decoding with OpenFst on CPU for aishell conformer_ctc (#1279 )	2023-10-01 13:46:16 +08:00
yaguang	8181d19860	check bbpe model exists in advance. (#1277 )	2023-09-27 17:35:26 +08:00
yaguang	a5ba1133c4	Compatible with new lhotse versions. (#1278 )	2023-09-27 17:33:38 +08:00
zr_jin	ef658d691e	fixes for init value of `diagnostics.TensorDiagnosticOptions` (#1269 ) * fixes for `diagnostics` Replace `2 ** 22` with `512` as the default value of `diagnostics.TensorDiagnosticOptions` also black formatted some scripts * fixed formatting issues	2023-09-24 17:06:47 +08:00
Fangjun Kuang	34e40a86b3	Fix exporting decoder model to onnx (#1264 ) * Use torch.jit.script() to export the decoder model See also https://github.com/k2-fsa/sherpa-onnx/issues/327	2023-09-22 09:57:15 +08:00
Fangjun Kuang	f5dc957d44	Fix CI tests (#1266 )	2023-09-21 21:16:14 +08:00
zr_jin	7cc2dae940	Fixes to incorporate with the latest Lhotse release (#1249 )	2023-09-13 12:39:49 +08:00
zr_jin	a81396b482	Use tokens.txt to replace bpe.model (#1162 )	2023-08-12 16:53:59 +08:00
zr_jin	74806b744b	disable speed perturbation by default (#1176 ) * disable speed perturbation by default * minor fixes * minor updates * updated bash scripts to incorporate with the `speed-perturb` arg * minor fixes 1. changed the naming scheme from `speed-perturb` to `perturb-speed` to align with the librispeech recipe >> `00256a7669/egs/librispeech/ASR/local/compute_fbank_librispeech.py (L65)` 2. changed arg type for `perturb-speed` to str2bool	2023-08-10 20:56:02 +08:00
zr_jin	856c0f2a60	fixed default param for an aishell recipe (#1159 )	2023-07-04 19:12:39 +08:00
Wei Kang	219bba1310	zipformer wenetspeech (#1130 ) * copy files * update train.py * small fixes * Add decode.py * Fix dataloader in decode.py * add blank penalty * Add blank-penalty to other decoding method * Minor fixes * add zipformer2 recipe * Minor fixes * Remove pruned7 * export and test models * Replace bpe with tokens in export.py and pretrain.py * Minor fixes * Minor fixes * Minor fixes * Fix export * Update results * Fix zipformer-ctc * Fix ci * Fix ci * Fix CI * Fix CI --------- Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>	2023-06-26 09:33:18 +08:00
Wei Kang	ba257efbcd	Add Context biasing (#1038 ) * Add context biasing for librispeech * Add context biasing for wenetspeech * fix bugs * Implement Aho-Corasick context graph * fix some bugs * Fixes to forward_one_step; add draw to context graph * add output arc; fix black * Fix wenetspeech tokenizer * Minor fixes to the decode.py	2023-06-03 21:28:49 +08:00
Fangjun Kuang	7b0afbdc16	Remove cur_batch_idx (#1102 )	2023-05-30 14:49:54 +08:00
marcoyang1998	585e7b224f	Aishell pruned_transducer_stateless7 (#962 ) * Add pruned_transducer_stateless7 for Aishell * update README.md * update comments and small fixes	2023-05-23 11:04:33 +08:00
Wei Kang	80156dda09	Training with byte level BPE (AIShell) (#986 ) * copy files from zipformer librispeech * Add byte bpe training for aishell * compile LG graph * Support LG decoding * Minor fixes * black * Minor fixes * export & fix pretrain.py * fix black * Update RESULTS.md * Fix export.py	2023-05-04 19:16:17 +08:00
Wei Kang	0efed1cec5	Fix path in aishell rnnlm training (#1016 )	2023-04-20 23:09:31 +08:00
Wei Kang	5c65516e05	Fix aishell rnnlm training command (#1015 )	2023-04-20 16:14:16 +08:00
marcoyang1998	d337398d29	Shallow fusion for Aishell (#954 ) * add shallow fusion and LODR for aishell * update RESULTS * add save by iterations	2023-04-03 16:20:29 +08:00
Fangjun Kuang	35e21a0d2e	Fix torchscript export for aishell (#969 )	2023-03-27 14:08:26 +08:00
Jason's Lab	6196b4a407	Add char-based language model training process for aishell. (#945 ) * Add char-based language model training process for aishell. Add soft link from librispeech/ASR/local/sort_lm_training_data.py to aishell/ASR/local/ --------- Co-authored-by: lichao <www.563042811@qq.com>	2023-03-16 09:52:11 +08:00
Fangjun Kuang	f5de2e90c6	Fix style issues. (#937 )	2023-03-08 22:56:04 +08:00
pehonnet	07243d136a	remove key from result filename (#936 ) Co-authored-by: pe-honnet <pe.honnet@telepathy.ai>	2023-03-08 21:06:07 +08:00
Meng Wei	74a2069f94	fix expired links (#856 )	2023-01-28 14:43:47 +08:00
marcoyang	53454701cb	fix segmentation fault	2022-11-22 11:39:21 +08:00
Desh Raj	d31db01037	manual correction of black formatting	2022-11-17 14:18:05 -05:00
Desh Raj	107df3b115	apply black on all files	2022-11-17 09:42:17 -05:00
Fangjun Kuang	60317120ca	Revert "Apply new Black style changes"	2022-11-17 20:19:32 +08:00
Desh Raj	d110b04ad3	apply new black formatting to all files	2022-11-16 13:06:43 -05:00
Fangjun Kuang	ff3f026381	Checkout the LM for aishell explicitly (#642 )	2022-10-31 19:47:43 +08:00
Fangjun Kuang	d1f16a04bd	fix type hints for decode.py (#623 )	2022-10-18 06:56:12 +08:00
LIyong.Guo	923b60a7c6	padding zeros (#591 )	2022-09-28 21:20:33 +08:00
Fangjun Kuang	e18fa78c3a	Check that read_manifests_if_cached returns a non-empty dict. (#555 )	2022-08-28 11:50:11 +08:00
Lucky Wong	9277c95bcd	Pruned transducer stateless2 for AISHELL-1 (#536 ) * Fix not enough values to unpack error . * [WIP] Pruned transducer stateless2 for AISHELL-1 * fix the style issue * code format for black * add pruned-transducer-stateless2 results for AISHELL-1 * simplify result	2022-08-22 10:17:26 +08:00
Lucky Wong	31686ac829	Fix not enough values to unpack error . (#533 )	2022-08-18 10:45:06 +08:00
Wei Kang	5c17255eec	Sort results to make it more convenient to compare decoding results (#522 ) * Sort result to make it more convenient to compare decoding results * Add cut_id to recognition results * add cut_id to results for all recipes * Fix torch.jit.script * Fix comments * Minor fixes * Fix torch.jit.tracing for Pytorch version before v1.9.0	2022-08-12 07:12:50 +08:00
Fangjun Kuang	5149788cb2	Fix computing averaged loss in the aishell recipe. (#523 ) * Fix computing averaged loss in the aishell recipe. * Set find_unused_parameters optionally.	2022-08-09 10:53:31 +08:00
boji123	3c9e7f733b	[debug] raise remind when git-lfs not available (#504 ) * [debug] raise remind when git-lfs not available * modify comment	2022-07-28 16:17:49 +08:00
Fangjun Kuang	385645d533	Fix get_transducer_model() for aishell. (#497 ) PR #495 introduces an error. This commit fixes it.	2022-07-26 15:42:21 +08:00
Fangjun Kuang	d3fc4b031e	Support using aidatatang_200zh optionally in aishell training (#495 ) * Use aidatatang_200zh optionally in aishell training.	2022-07-26 11:25:01 +08:00
Jun Wang	d792bdc9bc	fix typo (#445 )	2022-06-25 11:00:53 +08:00

82 Commits