Fangjun Kuang
bd7fa2253d
Update the manifest statistics of the L subset of wenetspeech (#731)
2022-12-04 20:27:45 +08:00
Desh Raj
107df3b115
apply black on all files
2022-11-17 09:42:17 -05:00
Fangjun Kuang
60317120ca
Revert "Apply new Black style changes"
2022-11-17 20:19:32 +08:00
Desh Raj
d110b04ad3
apply new black formatting to all files
2022-11-16 13:06:43 -05:00
Fangjun Kuang
e18fa78c3a
Check that read_manifests_if_cached returns a non-empty dict. (#555)
2022-08-28 11:50:11 +08:00
Fangjun Kuang
d68b8e9120
Disable CUDA_LAUNCH_BLOCKING in wenetspeech recipes. (#554)
* Disable CUDA_LAUNCH_BLOCKING in wenetspeech recipes.
* minor fixes
2022-08-28 11:17:38 +08:00
Weiji Zhuang
36eacaccb2
Fix preparing char based lang and add multiprocessing for wenetspeech text segmentation (#513)
* add multiprocessing for wenetspeech text segmentation
* Fix preparing char based lang for wenetspeech
* fix style
Co-authored-by: WeijiZhuang <zhuangweiji@xiaomi.com>
2022-08-03 19:19:40 +08:00
Mingshuang Luo
1b478d3ac3
Add other decoding methods (nbest, nbest oracle, nbest LG) for wenetspeech pruned rnnt2 (#482)
* add other decoding methods for wenetspeech
* changes for RESULTS.md
* add ngram-lm-scale=0.35 results
* set ngram-lm-scale=0.35 as default
* Update README.md
* add nbest-scale for file name
2022-07-29 12:03:08 +08:00
Fangjun Kuang
ec69967584
Set overwrite=True when extracting features in batches. (#487)
2022-07-29 11:17:19 +08:00
Yuekai Zhang
c17233eca7
[Ready] [Recipes] add aishell2 (#465)
* add aishell2
* fix aishell2
* add manifest stats
* update prepare char dict
* fix lint
* setting max duration
* lint
* change context size to 1
* update result
* update hf link
* fix decoding comment
* add more decoding methods
* update result
* change context-size 2 default
2022-07-14 14:46:56 +08:00
Mingshuang Luo
8e0b7ea518
mv split cuts before computing feature (#461)
2022-07-04 11:59:37 +08:00
Mingshuang Luo
10e8bc5b56
do a change (#460)
2022-07-03 19:35:01 +08:00
Fangjun Kuang
ed66877694
Replace ChunkedLilcomHdf5Writer with LilcomChunkyWriter. (#411)
2022-06-09 11:18:52 +08:00
Mingshuang Luo
5079d99ee2
a correction for text2segmentation.py (#407)
2022-06-08 12:06:57 +08:00
Fangjun Kuang
f1abce72f8
Use jsonl for CutSet in the LibriSpeech recipe. (#397)
* Use jsonl for cutsets in the librispeech recipe.
* Use lazy cutset for all recipes.
* More fixes to use lazy CutSet.
* Remove force=True from logging to support Python < 3.8
* Minor fixes.
* Fix style issues.
2022-06-06 10:19:16 +08:00
Ewald Enzinger
8c5722de8c
[egs] Add prefix when reading manifests due to recent lhotse changes (#382)
* [egs] Add prefix when reading manifests due to recent lhotse changes
* Fix wenetspeech
* Fix style issues
2022-05-23 23:37:35 +08:00
Mingshuang Luo
0e57b30495
[Ready to merge] Pruned Transducer Stateless2 for WenetSpeech (char-based) (#349)
* add char-based pruned-rnnt2 for wenetspeech
* style check
* style check
* change for export.py
* do some changes
* do some changes
* a small change for .flake8
* solve the conflicts
2022-05-23 17:13:01 +08:00