53 Commits

Author SHA1 Message Date
yaozengwei
42800f775e remove score sorting in test mode 2023-08-02 19:26:48 +08:00
Daniel Povey
d0309c3f3d Increase penalty cutoff in NonlinAttention to 40. 2023-05-29 23:02:59 +08:00
Daniel Povey
265e190946 Penalize large values in NonlinAttentionModule 2023-05-29 19:17:47 +08:00
Daniel Povey
cbd59b9c68 Don't skip penalize_abs_values_gt due to memory cutoff; remove grad_scale=0.1 2023-05-29 16:29:48 +08:00
Daniel Povey
7fdd125ba9 Merge branch 'zlm50' into zlm51 2023-05-29 13:54:53 +08:00
Daniel Povey
f05f1a6353 Increase grad_scale and prob in score_balancer 2023-05-29 13:20:07 +08:00
Daniel Povey
0f27b14376 Support unbalanced structures 2023-05-29 13:13:29 +08:00
Daniel Povey
b85012aa0b Merge branch 'zlm49' into zlm51 2023-05-29 12:20:43 +08:00
Daniel Povey
42f3ad0a11 Remove grad_scale=0.1 2023-05-29 11:55:18 +08:00
Daniel Povey
16e51a7deb remove find_unused_parameters=True and use bypass module 2023-05-29 11:54:21 +08:00
Daniel Povey
d975d59c7d remove bypass_scale 2023-05-29 11:46:18 +08:00
Daniel Povey
d950496d5a Increase grad_scale in score_balancer 2023-05-29 10:56:01 +08:00
Daniel Povey
137ac513bf Some changes to try to reduce mem consumption; decrease batch size 2023-05-28 21:50:34 +08:00
Daniel Povey
625e39fd1a Avoid penalize_abs_values_gt when memory usage high 2023-05-28 20:40:47 +08:00
Daniel Povey
bc55fb96eb Set final skip/bypass rates to zero 2023-05-28 16:30:28 +08:00
Daniel Povey
8483ca2e8f More partial work 2023-05-24 16:04:05 +08:00
Daniel Povey
e51a2c9170 Partial work 2023-05-23 14:01:04 +08:00
Daniel Povey
3a71a53d8d Set lr_factor on to_scores, max_abs=4.0 on balancer 2023-05-23 10:56:03 +08:00
Daniel Povey
c1de4cc847 Remove factor of 2 in weights_discarded 2023-05-19 20:13:12 +08:00
Daniel Povey
4a425f7eb5 Half the time, flip weights_discarded 2023-05-19 18:04:05 +08:00
Daniel Povey
5fc0cce553 Introduce factor of 2 to more strongly penalize discarded weights. 2023-05-19 16:31:45 +08:00
Daniel Povey
fb758b3540 Fix f-string bug 2023-05-18 22:29:13 +08:00
Daniel Povey
769033c857 Increase eps; make it added not applied as floor. 2023-05-18 20:08:19 +08:00
Daniel Povey
57a023902c Remove flipping of weights; reduce eps. 2023-05-18 19:50:16 +08:00
Daniel Povey
c487f9a0ef Try removing weight_scale 2023-05-18 18:41:39 +08:00
Daniel Povey
d2c198c072 Implement weight_scale, set weight_scale=10 2023-05-18 15:48:14 +08:00
Daniel Povey
f6c7392430 Bug fix 2023-05-18 15:37:33 +08:00
Daniel Povey
cdfa388ac0 Revert optim schedule 2023-05-18 15:35:23 +08:00
Daniel Povey
299482d02d More debug print 2023-05-18 15:12:57 +08:00
Daniel Povey
76e6726178 Implement random rotation of dims 2023-05-18 14:56:44 +08:00
Daniel Povey
d631ffec5b indentation change 2023-05-18 14:49:56 +08:00
Daniel Povey
e976af699e Remove unused variable 2023-05-18 14:17:31 +08:00
Daniel Povey
a514d23df7 Change how we penalize weights 2023-05-18 14:14:50 +08:00
Daniel Povey
9367ea3646 Don't drop last batch 2023-05-18 12:47:28 +08:00
Daniel Povey
24e8a7a8fd Remove pointless assertion 2023-05-17 14:54:29 +08:00
Daniel Povey
62c34f15c6 Remove print statement 2023-05-17 13:22:02 +08:00
Daniel Povey
53410608a6 Try to implement test mode; fix issue where middle stack had not been downsampled. 2023-05-17 13:03:19 +08:00
Daniel Povey
399a79ace6 Change chunk-size setup 2023-05-16 19:47:23 +08:00
Daniel Povey
e062c71076 Efficiency, small fix 2023-05-16 17:34:21 +08:00
Daniel Povey
cf93d1f129 Bug fix regarding chunk-size reshaping 2023-05-16 17:30:48 +08:00
Daniel Povey
5f5df4367d Fix error in how src was reshaped 2023-05-16 17:19:47 +08:00
Daniel Povey
3f72813a96 Various bug fixes, implementing chunking 2023-05-16 16:27:09 +08:00
Daniel Povey
0006a4c4db Implement chunk sizes, to the extent that the program runs. 2023-05-16 16:13:20 +08:00
Daniel Povey
8001a46758 Fix bugs 2023-05-15 22:49:43 +08:00
Daniel Povey
cc81ec4f8a bug fix 2023-05-15 22:07:27 +08:00
Daniel Povey
0a76215fd7 Code cleanup 2023-05-15 22:01:19 +08:00
Daniel Povey
d2d0ce0335 Try to get rid of gradient blowup 2023-05-15 20:26:21 +08:00
Daniel Povey
a397a5973b Increase num parameters 2023-05-15 20:11:20 +08:00
Daniel Povey
047c6ffc58 First version of subformer that runs. 2023-05-15 16:03:01 +08:00
Daniel Povey
1b8be0744f Fix various bugs 2023-05-15 15:20:02 +08:00