Daniel Povey
|
1a0155fcb5
|
Merge branch 'scaled_adam_exp863' into scaled_adam_exp870
# Conflicts:
# egs/librispeech/ASR/pruned_transducer_stateless7/scaling.py
|
2023-01-08 23:36:29 +08:00 |
|
Daniel Povey
|
326cb75033
|
Increase layer_skip_rate slightly
|
2023-01-08 15:48:23 +08:00 |
|
Daniel Povey
|
62b42887b4
|
Revert zipformer.py to status on previous commit
|
2023-01-08 13:17:39 +08:00 |
|
Daniel Povey
|
e952598677
|
Merge branch 'scaled_adam_exp846' into scaled_adam_exp866
|
2023-01-08 13:16:24 +08:00 |
|
Daniel Povey
|
117db124d0
|
Implement higher layerdrop for central stacks
|
2023-01-08 13:16:10 +08:00 |
|
Daniel Povey
|
b3527fe4ac
|
Implement caching evaluation for ConvNeXt
|
2023-01-07 17:31:20 +08:00 |
|
Daniel Povey
|
9b0c0aabb2
|
Merge branch 'scaled_adam_exp829' into scaled_adam_exp860
# Conflicts:
# egs/librispeech/ASR/pruned_transducer_stateless7/zipformer.py
|
2023-01-06 22:24:45 +08:00 |
|
Daniel Povey
|
5564a0efb0
|
Further tune lr scales; increase base-lr
|
2023-01-06 13:34:48 +08:00 |
|
Daniel Povey
|
f6f088489d
|
Adjust lr_scales, make them closer to 1.
|
2023-01-05 23:49:42 +08:00 |
|
Daniel Povey
|
ccc38a97f7
|
Reduce lr_scales of some sub modules
|
2023-01-05 18:50:04 +08:00 |
|
Daniel Povey
|
95e8296014
|
Use downsampling_factor ** -0.333 as the scale for stacks
|
2023-01-05 14:23:40 +08:00 |
|
Daniel Povey
|
1db509ea31
|
Attempt to implement slower learning for downsampled modules
|
2023-01-05 13:39:22 +08:00 |
|
Daniel Povey
|
22b4a417dd
|
Implement extra_layerdrop
|
2023-01-04 20:59:58 +08:00 |
|
Daniel Povey
|
f688066517
|
Merge branch 'scaled_adam_exp823' into scaled_adam_exp843
|
2023-01-04 17:02:37 +08:00 |
|
Daniel Povey
|
f7d67f5456
|
Higher dropout schedule for SmallConvolutionModule
|
2023-01-02 14:58:23 +08:00 |
|
Daniel Povey
|
5223286424
|
Add SmallConvolutionModule
|
2023-01-02 14:47:28 +08:00 |
|
Daniel Povey
|
a61bd01e5b
|
Change convnext1 kernel size from (5, 5) to (5, 7)
|
2023-01-02 14:17:51 +08:00 |
|
Daniel Povey
|
e4d0ac0946
|
Let the feedforward dims be respectively 3*feedforward_dim // 4 and 5*feedforward_dim // 4.
|
2023-01-02 00:24:12 +08:00 |
|
Daniel Povey
|
3a5b3f640d
|
Remove eps from BasicNorm and reintroduce bias
|
2023-01-02 00:02:31 +08:00 |
|
Daniel Povey
|
e52bfb7219
|
Revert final conv_skip_rate from 0.01 to 0.0
|
2023-01-01 22:13:13 +08:00 |
|
Daniel Povey
|
460fb945ec
|
Merge branch 'scaled_adam_exp813' into scaled_adam_exp820
|
2023-01-01 22:12:10 +08:00 |
|
Daniel Povey
|
977d412690
|
Cosmetic fix
|
2023-01-01 21:43:14 +08:00 |
|
Daniel Povey
|
60d491eee6
|
Bug fix
|
2023-01-01 14:31:28 +08:00 |
|
Daniel Povey
|
287bd120be
|
Reduce min_abs of zipformer balancer1; constraints on eps of Conv2dSubsampling.out_norm
|
2023-01-01 14:28:18 +08:00 |
|
Daniel Povey
|
a2815ea0df
|
Increase max_abs of ZipformerEncoderLayer.balancer2 from 1.0 to 4.0.
|
2023-01-01 00:00:26 +08:00 |
|
Daniel Povey
|
63472a19b1
|
Whitespace fix
|
2022-12-31 23:50:09 +08:00 |
|
Daniel Povey
|
4a4d12c994
|
Revert kernel size of convnext2 from 5x5 to 7x7
|
2022-12-31 21:52:11 +08:00 |
|
Daniel Povey
|
d0ae60400e
|
Decrease convnext1 kernel size from 7x7 to 5x5
|
2022-12-31 17:19:02 +08:00 |
|
Daniel Povey
|
d48b2ccb45
|
Reduce kernel size of convnext2 from 7 to 5.
|
2022-12-31 17:10:31 +08:00 |
|
Daniel Povey
|
c533c30442
|
Increase final conv_skip_rate from 0.0 to 0.01
|
2022-12-31 15:10:52 +08:00 |
|
Daniel Povey
|
577c3ad390
|
Adjust balancers of modules; most significant change is to reduce min_abs of ff2 balancer from 0.5 to 0.1
|
2022-12-31 14:38:00 +08:00 |
|
Daniel Povey
|
c15578d0bb
|
Add balancer_ff2 to avoid too small ff2 module
|
2022-12-31 01:09:17 +08:00 |
|
Daniel Povey
|
9ee4472f36
|
Decrease min_abs at end of feedforward modules from 0.5 to 0.1.
|
2022-12-30 23:29:03 +08:00 |
|
Daniel Povey
|
da0623aa7f
|
Add another balancer to ZipformerEncoderLayer, prior to output.
|
2022-12-30 14:35:49 +08:00 |
|
Daniel Povey
|
59be36181c
|
Replace ActivationBalancer with Balancer
|
2022-12-29 20:34:46 +08:00 |
|
Daniel Povey
|
c6bad1ee4f
|
Start ff modules with larger initial_scale
|
2022-12-29 18:50:12 +08:00 |
|
Daniel Povey
|
fbdb12cf77
|
Remove ZipformerEncoder.norm
|
2022-12-29 16:00:34 +08:00 |
|
Daniel Povey
|
0de1184c6d
|
Fix min_abs for AttentionSqueeze
|
2022-12-29 15:24:13 +08:00 |
|
Daniel Povey
|
03e1f7dc01
|
Multiply min_abs values in line of encoder residuals by 4.
|
2022-12-29 12:49:04 +08:00 |
|
Daniel Povey
|
71d7843654
|
Re-introduce bias into BasicNorm and replace eps with log_scale.
|
2022-12-26 21:22:00 +08:00 |
|
Daniel Povey
|
920ed685ac
|
Change how bypass_scale works, src = src * bypass_scale + src_orig * (1.0 - bypass_scale)
|
2022-12-26 14:27:16 +08:00 |
|
Daniel Povey
|
11f5454b6a
|
Increase eps_max in norm_final from 3 to 4.
|
2022-12-26 13:47:23 +08:00 |
|
Daniel Povey
|
3d6ee443e3
|
Revert some recent changes that may not have been helpful.
|
2022-12-24 21:17:43 +08:00 |
|
Daniel Povey
|
43f2a8d50b
|
Make norm_final apply to delta, not src
|
2022-12-24 18:44:42 +08:00 |
|
Daniel Povey
|
72420eef10
|
Change final layerdrop_rate from 0.0 to 0.015.
|
2022-12-23 21:56:13 +08:00 |
|
Daniel Povey
|
2e0f4de8ff
|
Apply limit on BasicNorm.eps more effectively using limit_param_value; add final norm to Zipformer.
|
2022-12-23 15:59:51 +08:00 |
|
Daniel Povey
|
cff350d8de
|
Merge branch 'scaled_adam_exp760' into scaled_adam_exp765
|
2022-12-23 13:08:09 +08:00 |
|
Daniel Povey
|
edd6e0faf1
|
Add whitening on ConvNeXt module outputs; change grad_scale on whiten of Conv2dSubsampling.
|
2022-12-23 11:35:11 +08:00 |
|
Daniel Povey
|
ade7db54e3
|
Revert BasicNorm to its previous status, without the bias
|
2022-12-22 23:47:21 +08:00 |
|
Daniel Povey
|
b2125535fb
|
Remove mistaken factor of 4.0
|
2022-12-22 23:19:16 +08:00 |
|