marcoyang1998
d84631c403
Merge cc168d104128348e9e24835c856c1bd946638e71 into 231bbcd2b638826a94cf019fa31ae8683d3552ee
2023-11-03 17:07:28 +08:00
zr_jin
23913f6afd
Minor refinements for some stale but recently merged PRs (#1354)
* incorporate https://github.com/k2-fsa/icefall/pull/1269
* incorporate https://github.com/k2-fsa/icefall/pull/1301
* black formatted
* incorporate https://github.com/k2-fsa/icefall/pull/1162
* black formatted
2023-10-31 10:28:20 +08:00
zr_jin
f9980aa606
minor fixes (#1332)
2023-10-24 08:17:17 +08:00
zr_jin
92ef561ff7
Minor fixes for torch.jit.script support (#1329)
2023-10-24 01:10:50 +08:00
marcoyang1998
ce372cce33
Update documentation to PromptASR (#1321)
2023-10-19 17:24:31 +08:00
marcoyang1998
16a2748d6c
PromptASR for contextualized ASR with controllable style (#1250)
* Add PromptASR with BERT as text encoder
* Support using word-list based content prompts for context biasing
* Upload the pretrained models to huggingface
* Add usage example
2023-10-11 14:56:41 +08:00
marcoyang1998
cc168d1041
update the pipeline
2023-08-09 12:11:43 +08:00
marcoyang1998
b8540ac3c0
minor fix
2023-07-20 15:51:34 +08:00
marcoyang1998
754ac00509
add more normalizations such as number/year to words; fix a few bugs when feeding input to WER computation
2023-07-20 15:50:50 +08:00
marcoyang1998
5532bb1683
add files for decoding
2023-07-19 22:05:53 +08:00
marcoyang1998
4f3a6606ad
add necessary files for training
2023-07-19 22:04:11 +08:00
marcoyang1998
88a311734d
add script to prepare validation and test sets
2023-07-19 11:01:07 +08:00
marcoyang1998
0aee07fb4c
change the valid/test sets; only do simple normalization in the dataloader, i.e only replace full-width symbol, replace double hyphen with space
2023-07-19 11:00:07 +08:00
marcoyang1998
0d1cd4f595
add char coverage option to avoid having a lot of rarely used tokens in the BPE; add the option to use byte-fallback in training BPE
2023-07-19 10:55:57 +08:00
marcoyang1998
b53c0d1e5f
initial commit for zipformer recipe
2023-07-18 11:42:19 +08:00
marcoyang1998
6939b3d6aa
minor fixes
2023-07-18 11:14:06 +08:00
marcoyang
0e7df7c5c4
add necessary utility files
2023-07-18 10:06:22 +08:00
marcoyang
189d424b25
only use medium text to train the BPE as the whole corpus is tooooo large
2023-07-18 10:06:01 +08:00
marcoyang
fef229e024
add necessary files to compute features
2023-07-17 10:36:25 +08:00
marcoyang
44d01195c0
initial commit for libriheavy
2023-07-14 23:50:27 +08:00