Mirror of https://github.com/k2-fsa/icefall.git (synced 2025-09-19 05:54:20 +00:00)

Refactoring that does not affect results.

parent 3110138ab5
commit 17bc002e6e
@@ -124,10 +124,9 @@ param_rms_smooth1: Smoothing proportion for parameter matrix, if assumed rank of
            (4) is for smoothing the grad covariance used for (2)
-           (5) is for smoothing the final learning-rate matrix Z relative to
-              its own diagonal. Only the cov_max is actually used;
-              the cov_min ends up not affecting the result, so we set it
-              to 0.0.
+           (5) is for smoothing the inverse Z^{-1} of the final learning-rate
+              matrix Z relative to its own diagonal. Only the cov_min[4] is
+              actually used; we ignore cov_max[4].

   cov_pow: This was mainly added for development and experimentation purposes;
            it allows you to smooth the parameter covariance matrices at the
            stages (1), (2), (3) of smoothing mentioned above, and also
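The docstring above describes per-stage covariance smoothing controlled by cov_min, cov_max and cov_pow. The icefall optimizer's actual implementation is not shown in this hunk; the following is a hypothetical sketch (the function name `smooth_cov` and the mean-relative clamping convention are assumptions, not taken from the source) of what clamping a covariance's eigenvalues to [cov_min, cov_max] and power-smoothing them could look like:

```python
# Hypothetical sketch, NOT icefall's implementation: smooth a symmetric
# positive-definite covariance by clamping its eigenvalues to
# [cov_min, cov_max] relative to their mean, then flattening them with
# an optional power (cov_pow=1.0 leaves them unchanged after clamping).
import numpy as np

def smooth_cov(cov, cov_min, cov_max, cov_pow):
    eigs, U = np.linalg.eigh(cov)                         # eigendecomposition
    mean = eigs.mean()
    eigs = np.clip(eigs, cov_min * mean, cov_max * mean)  # floor and ceiling
    eigs = mean * (eigs / mean) ** cov_pow                # power smoothing
    return (U * eigs) @ U.T                               # reassemble U diag(eigs) U^T

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
cov = A @ A.T                                             # random SPD covariance
smoothed = smooth_cov(cov, cov_min=0.025, cov_max=10.0, cov_pow=1.0)
```

With cov_pow below 1.0 the eigenvalue spread is compressed toward the mean, which matches the docstring's description of it as an experimentation knob.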
@@ -167,8 +166,8 @@ param_rms_smooth1: Smoothing proportion for parameter matrix, if assumed rank of
             lr=3e-02,
             betas=(0.9, 0.98),
             size_lr_scale=0.1,
-            cov_min=(0.025, 0.0025, 0.02, 0.0001, 0.0),
-            cov_max=(10.0, 80.0, 5.0, 400.0, 10.0),
+            cov_min=(0.025, 0.0025, 0.02, 0.0001, 0.1),
+            cov_max=(10.0, 80.0, 5.0, 400.0, 100.0),
             cov_pow=(1.0, 1.0, 1.0, 1.0),
             param_rms_smooth0=0.4,
             param_rms_smooth1=0.2,
@@ -974,7 +973,7 @@ param_rms_smooth1: Smoothing proportion for parameter matrix, if assumed rank of
                 # this is smoothing Z relative to its own diagonal. This is z_inv,
                 # so by applying a minimum here, we are applying a maximum of the
                 # eigs of Z after normalizing so the diagonal is 1.
-                Z_prime_inv_diag *= (1. + 1. / group["cov_max"][4])
+                Z_prime_inv_diag *= (1. + group["cov_min"][4])

                 # We really want the SVD on Z, which will be used for the learning-rate matrix
                 # Q, but Z_prime is better, numerically, to work on because it's closer to
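The comment in this hunk states that applying a minimum on the inverse is equivalent to applying a maximum on the eigenvalues of Z after normalizing its diagonal to 1. A minimal numerical sketch of that equivalence (not the icefall code itself; the variable names and the use of a plain `eps` in place of `group["cov_min"][4]` are assumptions for illustration):

```python
# Minimal sketch, NOT the icefall implementation: adding a floor eps to
# the inverse of the diagonally-normalized Z caps the eigenvalues of the
# smoothed, normalized Z at 1/eps, because each eigenvalue lam of N maps
# to 1 / (1/lam + eps) <= 1/eps.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
Z = A @ A.T + 0.1 * np.eye(4)           # a random SPD "learning-rate" matrix

d = np.sqrt(np.diag(Z))
N = Z / np.outer(d, d)                  # Z normalized so its diagonal is 1

eps = 0.1                               # plays the role of cov_min[4]
N_inv_smoothed = np.linalg.inv(N) + eps * np.eye(4)  # minimum on the inverse
N_smoothed = np.linalg.inv(N_inv_smoothed)

max_eig = np.linalg.eigvalsh(N_smoothed).max()
assert max_eig <= 1.0 / eps + 1e-9      # maximum on the eigs of normalized Z
```

This is why the change from `1. / group["cov_max"][4]` to `group["cov_min"][4]` in the diff is a renaming of which knob supplies the floor, consistent with the docstring change in the first hunk saying only cov_min[4] is actually used.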