Daniel Povey
|
8a9bbb93bc
|
Cosmetic fixes
|
2022-07-24 04:45:57 +08:00 |
|
Daniel Povey
|
966ac36cde
|
Fixes to comments
|
2022-07-24 04:36:41 +08:00 |
|
Daniel Povey
|
33ffd17515
|
Some cleanup
|
2022-07-24 04:22:11 +08:00 |
|
Daniel Povey
|
ddceb7963b
|
Interpolate between iterative estimate of scale, and original value.
|
2022-07-23 15:27:48 +08:00 |
|
Daniel Povey
|
2c4bdd0ad0
|
Add _update_param_scales_simple(), add documentation
|
2022-07-23 14:49:58 +08:00 |
|
Daniel Povey
|
9730352257
|
Reduce smoothing constant slightly
|
2022-07-23 13:12:31 +08:00 |
|
Daniel Povey
|
e1873fc0bb
|
Tune phase2 again, from 0.005,5.0 to 0.01,40. Epoch 140 is 0.21/0.149
|
2022-07-23 10:10:01 +08:00 |
|
Daniel Povey
|
0fc58bac56
|
More tuning, epoch-140 results are 0.23,0.11
|
2022-07-23 09:52:51 +08:00 |
|
Daniel Povey
|
34a2d331bf
|
Smooth in opposite orientation to G
|
2022-07-23 09:38:16 +08:00 |
|
Daniel Povey
|
a972655a70
|
Tuning.
|
2022-07-23 09:15:49 +08:00 |
|
Daniel Povey
|
b47433b77a
|
Fix bug in smooth_cov, for power==1.0
|
2022-07-23 09:06:03 +08:00 |
|
Daniel Povey
|
cc388675a9
|
Bug fix RE rankj
|
2022-07-23 08:24:59 +08:00 |
|
Daniel Povey
|
dee496145d
|
This version performs much worse but has bugs fixed; can optimize from here.
|
2022-07-23 08:11:20 +08:00 |
|
Daniel Povey
|
dd10eb140f
|
First version after refactorization and changing the math, where optim.py runs
|
2022-07-23 06:32:56 +08:00 |
|
Daniel Povey
|
4da4e69fba
|
Draft of new way of smoothing param_rms, diagonalized by grad
|
2022-07-22 06:37:20 +08:00 |
|
Daniel Povey
|
a63afe348a
|
Increase max_lr_factor from 3.0 to 4.0
|
2022-07-19 06:56:41 +08:00 |
|
Daniel Povey
|
79a2f09f62
|
Change how formula for max_lr_factor works, and increase factor from 2.5 to 3.
|
2022-07-19 06:54:49 +08:00 |
|
Daniel Povey
|
525c097130
|
Increase power from 0.7 to 0.75
|
2022-07-19 05:44:03 +08:00 |
|
Daniel Povey
|
2dff1161b4
|
Reduce max_lr_factor from 3.0 to 2.5
|
2022-07-19 05:15:03 +08:00 |
|
Daniel Povey
|
8bb44b2944
|
Change param_pow from 0.6 to 0.7
|
2022-07-19 05:08:32 +08:00 |
|
Daniel Povey
|
bb1e1e154a
|
Increasing param_pow to 0.6 and decreasing max_lr_factor from 4 to 3.
|
2022-07-18 09:06:32 +08:00 |
|
Daniel Povey
|
8db3b48edb
|
Update parameter dependent part of cov more slowly, plus bug fix.
|
2022-07-18 05:26:55 +08:00 |
|
Daniel Povey
|
198cf2635c
|
Reduce param_pow from 0.5 to 0.4.
|
2022-07-17 15:35:07 +08:00 |
|
Daniel Povey
|
3857a87b47
|
Merge branch 'merge_refactor_param_cov_norank1_iter_batch_max4.0_pow0.5_fix2r_lrupdate200_2k_ns' into merge2_refactor_max4.0_pow0.5_200_1k_ma3.0
|
2022-07-17 15:32:43 +08:00 |
|
Daniel Povey
|
a572eb4e33
|
Reducing final lr_update_period from 2k to 1k
|
2022-07-17 12:56:02 +08:00 |
|
Daniel Povey
|
7e88e2a0e9
|
Increase debug freq; add type to diagnostics and increase precision of mean,rms
|
2022-07-17 06:40:16 +08:00 |
|
Daniel Povey
|
23e6d2e6d8
|
Fix to the fix
|
2022-07-16 06:53:44 +08:00 |
|
Daniel Povey
|
4c8d77d14a
|
Fix return type
|
2022-07-15 14:18:07 +08:00 |
|
Daniel Povey
|
68c5935691
|
Fix bug re param_cov freshness, properly.
|
2022-07-15 08:33:10 +08:00 |
|
Daniel Povey
|
b6ee698278
|
Make LR update period less frequent later in training; fix bug with param_cov freshness, was too fresh
|
2022-07-15 07:59:30 +08:00 |
|
Daniel Povey
|
689441b237
|
Reduce param_pow from 0.75 to 0.5
|
2022-07-14 06:08:06 +08:00 |
|
Daniel Povey
|
7f6fe02db9
|
Fix formula for smoothing (was applying more smoothing than intended, and in the opposite sense to intended), also revert max_rms from 2.0 to 4.0
|
2022-07-14 06:06:02 +08:00 |
|
Daniel Povey
|
4785245e5c
|
Reduce debug freq
|
2022-07-13 06:51:23 +08:00 |
|
Daniel Povey
|
d48fe0b99c
|
Change max rms from 10.0 to 4.0
|
2022-07-13 05:53:35 +08:00 |
|
Daniel Povey
|
cedfb5a377
|
Make max eig ratio 10
|
2022-07-12 13:59:58 +08:00 |
|
Daniel Povey
|
278358bb9f
|
Remove debug code
|
2022-07-12 08:39:14 +08:00 |
|
Daniel Povey
|
8c44ff26f7
|
Fix bug in batching code for scalars
|
2022-07-12 08:36:45 +08:00 |
|
Daniel Povey
|
41df045773
|
Simplify formula, getting rid of scalar_exp_avg_sq
|
2022-07-11 17:14:12 -07:00 |
|
Daniel Povey
|
4f0e219523
|
Bug fix to reproduce past results with max_block_size unset.
|
2022-07-11 17:03:32 -07:00 |
|
Daniel Povey
|
075a2e27d8
|
Replace max_fullcov_size with max_block_size
|
2022-07-11 16:37:01 -07:00 |
|
Daniel Povey
|
3468c3aa5a
|
Remove ActivationBalancer, unnecessary
|
2022-07-11 14:12:24 -07:00 |
|
Daniel Povey
|
7993c84cd6
|
Apparently working version, with changed test-code topology
|
2022-07-11 13:17:29 -07:00 |
|
Daniel Povey
|
245d39b1bb
|
Still debugging but close to done
|
2022-07-11 00:33:37 -07:00 |
|
Daniel Povey
|
27da50a1f6
|
Committing partial work..
|
2022-07-10 15:46:32 -07:00 |
|
Daniel Povey
|
d25df4af5e
|
Slight refactoring, preparing for batching.
|
2022-07-09 22:24:36 -07:00 |
|
Daniel Povey
|
d9a6180ae0
|
Bug fix
|
2022-07-10 10:20:39 +08:00 |
|
Daniel Povey
|
b7035844a2
|
Introduce scalar_max, stop eps getting large or small
|
2022-07-10 10:13:55 +08:00 |
|
Daniel Povey
|
2f73434541
|
Reduce debug frequency
|
2022-07-10 06:44:50 +08:00 |
|
Daniel Povey
|
b3bb2dac6f
|
Iterative, more principled way of estimating param_cov
|
2022-07-10 06:28:01 +08:00 |
|
Daniel Povey
|
d139c18f22
|
Max eig of Q limited to 5 times the mean
|
2022-07-09 14:30:03 +08:00 |
|