Change how the formula for max_lr_factor works, and increase the factor from 2.5 to 3.

Daniel Povey 2022-07-19 06:54:49 +08:00
parent 525c097130
commit 79a2f09f62

@@ -145,7 +145,7 @@ param_rms_smooth1: Smoothing proportion for parameter matrix, if assumed rank of
 param_pow=0.75,
 param_rms_smooth0=0.75,
 param_rms_smooth1=0.25,
-max_lr_factor=2.5,
+max_lr_factor=3.0,
 eps=1.0e-08,
 param_min_rms=1.0e-05,
 param_max_rms=2.0,
@@ -1016,11 +1016,10 @@ param_rms_smooth1: Smoothing proportion for parameter matrix, if assumed rank of
 ans = rms / new_mean
-# Apply max_lr_factor; approach the constraint in 2 steps because it
-# changes the mean, and it's relative to the mean.
-ans.clamp_(max=max_lr_factor * 2)
-ans /= _mean(ans, exclude_dims=[0], keepdim=True)
-ans.clamp_(max=max_lr_factor)
+# Apply a `soft min` of max_lr_factor via the formula
+# softmin(x,y) = 1/(1/x + 1/y).
+ans = 1. / (1. / ans + 1. / max_lr_factor)
+# and renormalize to mean=1.
 ans /= _mean(ans, exclude_dims=[0], keepdim=True)
 return ans
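
To make the difference between the two variants concrete, here is a minimal sketch in plain Python. It assumes a flat list of positive factors rather than a tensor, and the helper names (`soft_min`, `normalize_mean`, `clamp_normalize`, `soft_min_normalize`) are hypothetical; the actual code works on tensors and renormalizes with `_mean(ans, exclude_dims=[0], keepdim=True)`.

```python
def soft_min(x, y):
    # softmin(x, y) = 1/(1/x + 1/y); for positive x and y the result
    # is smaller than both arguments.
    return 1.0 / (1.0 / x + 1.0 / y)


def normalize_mean(xs):
    # Rescale so the mean of the list is 1 (stand-in for dividing by _mean).
    m = sum(xs) / len(xs)
    return [x / m for x in xs]


def clamp_normalize(xs, max_lr_factor):
    # Old variant: hard clamp in two steps, renormalizing after each step.
    xs = normalize_mean([min(x, max_lr_factor * 2) for x in xs])
    return normalize_mean([min(x, max_lr_factor) for x in xs])


def soft_min_normalize(xs, max_lr_factor):
    # New variant: one soft min against max_lr_factor, then renormalize.
    return normalize_mean([soft_min(x, max_lr_factor) for x in xs])


# Illustrative input only (assumed values, mean 1.0).
factors = [0.1, 0.4, 0.5, 3.0]
old = clamp_normalize(factors, 2.5)
new = soft_min_normalize(factors, 3.0)
print("old:", old)
print("new:", new)
```

In this sketch both variants keep the ordering of the factors and return a list with mean 1. The soft min compresses large values smoothly instead of cutting them off at a threshold, and because of the final renormalization neither variant is a strict upper bound on the result.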