5 Commits

Author | SHA1 | Message | Date
marcoyang1998 | 0982db9cde | add a few args to support context list and rare words | 2023-08-16 16:44:58 +08:00
marcoyang1998 | 4420788f66 | support using context list and random substring as pre text | 2023-08-16 16:44:29 +08:00
marcoyang1998 | 17d0918969 | fix the post normalization bug, avoid multiple words | 2023-08-16 09:39:42 +08:00
marcoyang1998 | fdc4fcabb9 | use a more aggresive sampling_weight | 2023-08-16 09:38:40 +08:00
marcoyang1998 | ae4d2fbfcc | initial commit | 2023-08-14 09:51:20 +08:00