Add clamping operation in Eve optimizer for all scalar weights (#550)

Clamp all scalar weights in the Eve optimizer to avoid unstable training in
some scenarios. The clamping range is set to [-10, 2].
Note that this change may cause unexpected effects if you resume
training from a model that was trained without clamping.
marcoyang1998 2022-08-25 12:12:50 +08:00 committed by GitHub
parent 0967cf5b38
commit 1e31fbcd7d

@@ -164,6 +164,10 @@ class Eve(Optimizer):
                     p.mul_(1 - (weight_decay * is_above_target_rms))
                 p.addcdiv_(exp_avg, denom, value=-step_size)
+                # Constrain the range of scalar weights
+                if p.numel() == 1:
+                    p.clamp_(min=-10, max=2)
         return loss
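
To see what the hunk above does, and why resuming from an older checkpoint can behave unexpectedly, here is a minimal plain-Python sketch. It is not the Eve implementation: parameters are modelled as lists of floats, and the helper names and the parameter names in `checkpoint` are hypothetical. Only the rule itself (single-element parameters are clamped to [-10, 2], larger ones are left alone) comes from the commit.

```python
# Plain-Python model of the clamping rule; not the actual optimizer code.
CLAMP_MIN, CLAMP_MAX = -10.0, 2.0  # range taken from the commit


def clamp_scalars(params):
    """Clamp single-element parameters in place; leave larger ones untouched."""
    for values in params.values():
        if len(values) == 1:  # mirrors `p.numel() == 1`
            values[0] = min(max(values[0], CLAMP_MIN), CLAMP_MAX)


def out_of_range_scalars(params):
    """Return names of scalar parameters whose value the clamp would change."""
    return [
        name
        for name, values in params.items()
        if len(values) == 1 and not CLAMP_MIN <= values[0] <= CLAMP_MAX
    ]


# Hypothetical checkpoint trained without clamping.
checkpoint = {
    "encoder.scale": [3.5],          # scalar, above the upper bound
    "decoder.bias_scale": [-0.7],    # scalar, already inside the range
    "linear.weight": [12.0, -15.0],  # not a scalar, never clamped
}

affected = out_of_range_scalars(checkpoint)
clamp_scalars(checkpoint)
print(affected)
print(checkpoint)
```

A scalar that ends up in `affected` would jump to the nearest bound on the first optimizer step after resuming, which is one way the unexpected effect mentioned in the commit message could show up.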