commit | author | date | subject
6caaa4e9c6 | Daniel Povey | 2022-12-15 23:32:29 +08:00 | Bug fix in caching_eval, may make no difference.
f5d4fb092d | Daniel Povey | 2022-12-15 23:24:36 +08:00 | Bug fix in caching_eval
d26ee2bf81 | Daniel Povey | 2022-12-15 23:06:40 +08:00 | Try to implement caching evaluation for memory efficient training
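The caching-evaluation entries above (d26ee2bf81 plus the two bug fixes) target memory-efficient training. The log gives no implementation detail, but the standard trade behind that phrase is activation checkpointing: store only a subset of intermediate activations in the forward pass and recompute the rest during backward. A minimal scalar sketch of that idea, using toy functions and taking nothing from the actual caching_eval code:

```python
import math

# Toy chain of pointwise functions with their derivatives, standing in
# for the layers of a network.
funcs = [
    (math.sin, math.cos),
    (math.tanh, lambda x: 1.0 - math.tanh(x) ** 2),
    (lambda x: x * x, lambda x: 2.0 * x),
    (math.exp, math.exp),
]

def grad_full(x0):
    """Baseline: store every intermediate activation, then chain-rule."""
    xs = [x0]
    for f, _ in funcs:
        xs.append(f(xs[-1]))
    grad = 1.0
    for i in range(len(funcs) - 1, -1, -1):
        grad *= funcs[i][1](xs[i])
    return xs[-1], grad

def grad_checkpointed(x0, every=2):
    """Store only every `every`-th activation in the forward pass;
    recompute the discarded ones segment by segment in the backward pass."""
    n = len(funcs)
    ckpt, x = {}, x0
    for i, (f, _) in enumerate(funcs):
        if i % every == 0:
            ckpt[i] = x            # keep only the input of each segment
        x = f(x)
    y, grad = x, 1.0
    for start in sorted(ckpt, reverse=True):
        xs = [ckpt[start]]         # recompute this segment's intermediates
        for f, _ in funcs[start:start + every]:
            xs.append(f(xs[-1]))
        for i in range(min(start + every, n) - 1, start - 1, -1):
            grad *= funcs[i][1](xs[i - start])
    return y, grad
```

Both functions return identical values; the checkpointed version holds roughly a 1/`every` fraction of the activations at any time, at the cost of one extra forward pass per segment.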
f66c1600f4 | Daniel Povey | 2022-12-15 21:55:23 +08:00 | Bug fix to printing code
076b18db60 | Daniel Povey | 2022-12-15 21:48:32 +08:00 | Implement Nextformer-style frontend
864ff96322 | Daniel Povey | 2022-12-15 19:27:29 +08:00 | Remove nonlin_skip_rate, introduce conv_skip_rate.
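The conv_skip_rate introduced in 864ff96322 (and renamed from nonlin_skip_rate in 1506b83c7b) suggests stochastically bypassing the convolution module during training, in the style of stochastic depth. A hedged sketch of that mechanism; the schedule constants below are illustrative, not taken from the recipe:

```python
import random

def maybe_skip(module, x, skip_rate, training=True):
    """With probability `skip_rate` during training, bypass the module
    entirely (identity); at eval time the module always runs.  This is the
    usual stochastic-depth formulation; how conv_skip_rate is actually
    applied in the recipe is an assumption here."""
    if training and random.random() < skip_rate:
        return x
    return module(x)

def conv_skip_rate(step, initial=0.5, final=0.0, end_step=20000):
    """Illustrative linear schedule: ramp from `initial` to `final` over the
    first `end_step` training steps (hypothetical constants)."""
    if step >= end_step:
        return final
    return initial + (step / end_step) * (final - initial)
```

A later commit in this log (117d418e27) makes the analogous nonlin_skip_rate nonzero and ends it after 20k iterations, which is the kind of decaying schedule sketched above.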
1506b83c7b | Daniel Povey | 2022-12-15 19:25:21 +08:00 | Change nonlin_skip_rate to be conv_skip_rate.
37a8c30136 | Daniel Povey | 2022-12-15 00:24:56 +08:00 | Merge branch 'scaled_adam_exp699' into scaled_adam_exp711
25834453db | Daniel Povey | 2022-12-15 00:21:31 +08:00 | Merge branch 'scaled_adam_exp698' into scaled_adam_exp710
             (conflict: egs/librispeech/ASR/pruned_transducer_stateless7/zipformer.py)
9e79b296f2 | Daniel Povey | 2022-12-14 22:56:09 +08:00 | Merge branch 'scaled_adam_exp708' into scaled_adam_exp709
aac9bebc62 | Daniel Povey | 2022-12-14 22:54:59 +08:00 | Bug fix
9bc326a9b6 | Daniel Povey | 2022-12-14 21:41:50 +08:00 | Merge branch 'scaled_adam_exp705' into scaled_adam_exp709
159f37ddeb | Daniel Povey | 2022-12-14 21:41:43 +08:00 | Merge branch 'scaled_adam_exp700' into scaled_adam_exp709
cec2162a17 | Daniel Povey | 2022-12-14 21:41:32 +08:00 | Merge branch 'scaled_adam_exp703' into scaled_adam_exp709
87df9f3215 | Daniel Povey | 2022-12-14 21:37:32 +08:00 | Simplify schedules of output balancers for nonlin_attention_module and attention_squeeze.
930f1b8948 | Daniel Povey | 2022-12-13 23:01:49 +08:00 | Reduce conv_module balancer2 min_abs from 0.75 to 0.5.
48445f22e4 | Daniel Povey | 2022-12-13 22:50:21 +08:00 | Increase ratio from 2.0 to 3.0 on 2 whitening schedules
157f4074a2 | Daniel Povey | 2022-12-13 21:41:15 +08:00 | Halve min_positive schedule of ConvolutionModule.
57040e382a | Daniel Povey | 2022-12-13 19:25:08 +08:00 | Set all aux-loss probs to zero.
52d18e405e | Daniel Povey | 2022-12-13 19:22:43 +08:00 | Change to balancer2 schedule of NonlinAttentionModule, remove peak at 8k.
117d418e27 | Daniel Povey | 2022-12-13 19:17:38 +08:00 | Make nonlin_skip_rate nonzero and end after 20k iters; remove peak at 8k iters of NonlinAttentionModule balancer2 min_abs.
8231350ac4 | Daniel Povey | 2022-12-13 18:54:46 +08:00 | Make AttentionSqueeze dim smaller, at embed_dim // 2.
22204450db | Daniel Povey | 2022-12-13 18:51:22 +08:00 | Make min_abs of AttentionSqueeze smaller, the same as nonlin_attention_module
8d75006d69 | Daniel Povey | 2022-12-13 18:48:05 +08:00 | Merge branch 'scaled_adam_exp690' into scaled_adam_exp694
d2465492f9 | Daniel Povey | 2022-12-12 23:32:08 +08:00 | Bug fix
b5e0676f14 | Daniel Povey | 2022-12-12 23:31:22 +08:00 | Invoke the out_balancer of attention_squeeze
0522425ea8 | Daniel Povey | 2022-12-12 23:30:12 +08:00 | Change min and max positive
7920fa7726 | Daniel Povey | 2022-12-12 23:29:42 +08:00 | Add out_balancer for attention_squeeze, similar to nonlin_attention_module.
7de7753ea2 | Daniel Povey | 2022-12-12 15:58:36 +08:00 | Change DoubleSwish to SwooshR in Conv2dSubsampling, double max_abs limits.
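Commit 7de7753ea2 swaps DoubleSwish for SwooshR in the Conv2dSubsampling frontend. For reference, both activations as I understand them from the Zipformer work; the SwooshR constants are quoted from memory and should be checked against the recipe's scaling.py:

```python
import math

def double_swish(x):
    """DoubleSwish(x) = x * sigmoid(x - 1): the activation being replaced."""
    return x / (1.0 + math.exp(1.0 - x))

def swoosh_r(x):
    """SwooshR(x) = log(1 + exp(x - 1)) - 0.08*x - 0.313261687.
    The additive constant is chosen so that SwooshR(0) = 0, and the
    -0.08*x term leaves a small negative slope for very negative inputs,
    so gradients never vanish completely."""
    return math.log1p(math.exp(x - 1.0)) - 0.08 * x - 0.313261687
```

Unlike DoubleSwish, which saturates to zero for large negative inputs, SwooshR keeps a small positive tail there, which plausibly interacts with the doubled max_abs limits mentioned in the same commit.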
f4ff6188d9 | Daniel Povey | 2022-12-11 19:29:35 +08:00 | Set max_abs values on Conv2dSubsampling module.
a01fc3b220 | Daniel Povey | 2022-12-11 19:12:03 +08:00 | Change attentionSqueeze dim from 128 to 256.
05c7cb5c83 | Daniel Povey | 2022-12-11 18:51:01 +08:00 | Reduce attention_squeeze dim from 512 to 128.
634f1a4b82 | Daniel Povey | 2022-12-11 17:20:52 +08:00 | Hardcode AttentionSqueeze dim at 512.
2d0fe7637c | Daniel Povey | 2022-12-11 17:20:26 +08:00 | Memory fix in WithLoss
0edaf4d25c | Daniel Povey | 2022-12-10 19:39:02 +08:00 | Merge branch 'scaled_adam_exp667' into scaled_adam_exp671
d7dd3f6dac | Daniel Povey | 2022-12-10 18:04:21 +08:00 | Merge branch 'scaled_adam_exp662' into scaled_adam_exp670
cb12014c31 | Daniel Povey | 2022-12-10 16:09:51 +08:00 | Implement dropout for scores in AttentionDownsample
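Commit cb12014c31 adds dropout to the scores in AttentionDownsample. A plain-Python sketch of the mechanism: the downsampler averages groups of `ds` consecutive frames with softmax weights derived from learned per-frame scores, and dropout randomly zeroes raw scores during training. The exact placement of the dropout (pre- versus post-softmax) is an assumption here:

```python
import math
import random

def attention_downsample(frames, scores, ds=2, dropout_p=0.0, training=True):
    """Downsample a sequence by `ds`: each output frame is a weighted average
    of `ds` consecutive input frames, with weights = softmax of per-frame
    scores.  During training, dropout zeroes raw scores (with inverted-
    dropout rescaling) before the softmax -- an assumed placement."""
    out = []
    for start in range(0, len(frames) - ds + 1, ds):
        s = list(scores[start:start + ds])
        if training and dropout_p > 0.0:
            s = [0.0 if random.random() < dropout_p else v / (1.0 - dropout_p)
                 for v in s]
        m = max(s)
        w = [math.exp(v - m) for v in s]   # numerically stable softmax
        z = sum(w)
        group = frames[start:start + ds]
        out.append(sum(wi * fi for wi, fi in zip(w, group)) / z)
    return out
```

Dropping scores (rather than the output) perturbs which frames dominate each downsampled group, so the module cannot learn to rely on a single frame per group.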
2f617fec43 | Daniel Povey | 2022-12-10 00:21:51 +08:00 | Set nonlin_skip_rate to zero; make final min_abs value smaller in balancer2 of NonlinAttentionModule.
30c6e5b929 | Daniel Povey | 2022-12-10 00:08:38 +08:00 | Make attention_squeeze use full dim.
0fc646f281 | Daniel Povey | 2022-12-10 00:07:37 +08:00 | Merge branch 'scaled_adam_exp663' into scaled_adam_exp665
d35eb7a3a6 | Daniel Povey | 2022-12-09 22:02:42 +08:00 | Add cosmetic/diagnostics changes from scaled_adam_exp656.
958d9b929d | Daniel Povey | 2022-12-09 21:00:24 +08:00 | Double limit of penalize_abs_values_gt in AttentionDownsample from 10 to 20.
a00ed7e976 | Daniel Povey | 2022-12-09 20:05:50 +08:00 | Decrease min_abs of NonlinAttentionModule from 0.75 to 0.5; make its max_abs (not active) a constant.
a92df3e850 | Daniel Povey | 2022-12-09 20:03:37 +08:00 | Reduce final min_abs on conv_module from 1.0 to 0.75.
31f2f95f59 | Daniel Povey | 2022-12-09 20:01:19 +08:00 | Reduce min_abs of ff module from 1.0 to 0.75
5c0957d950 | Daniel Povey | 2022-12-09 18:11:27 +08:00 | Fix memory issue in ActivationBalancer
2ef0228db0 | Daniel Povey | 2022-12-09 17:59:00 +08:00 | Make the ActivationBalancer relative to the mean, limited to -min_abs..max_abs
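Commit 2ef0228db0 (followed by the memory fix 5c0957d950) changes the ActivationBalancer to measure activation magnitude relative to the channel mean, constrained to the min_abs..max_abs band. In the real code this becomes a small gradient correction; the sketch below only computes the per-channel statistic and the direction of the nudge, and is my reading of the commit message rather than the actual implementation:

```python
def balance_direction(channel_values, min_abs=0.2, max_abs=100.0):
    """Measure a channel's magnitude as the mean absolute deviation from its
    own mean (the 'relative to the mean' change), and return the sign of the
    corrective nudge: +1 to grow deviations below min_abs, -1 to shrink
    deviations above max_abs, 0 when inside the [min_abs, max_abs] band.
    The default constants are placeholders, not the recipe's values."""
    n = len(channel_values)
    mean = sum(channel_values) / n
    mean_dev = sum(abs(v - mean) for v in channel_values) / n
    if mean_dev < min_abs:
        return +1
    if mean_dev > max_abs:
        return -1
    return 0
```

Measuring deviation from the mean rather than raw |x| makes the constraint invariant to a per-channel offset, so a channel with a large constant bias is not mistaken for a channel with large dynamic range.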
912adfff7c | Daniel Povey | 2022-12-08 21:11:58 +08:00 | Increase all ff dims by 256
75a1e05e49 | Daniel Povey | 2022-12-08 20:35:38 +08:00 | Introduce nonlin_skip_rate
1718b2de44 | Daniel Povey | 2022-12-08 20:35:02 +08:00 | Merge branch 'scaled_adam_exp647' into scaled_adam_exp652
             (conflict: egs/librispeech/ASR/pruned_transducer_stateless7/zipformer.py)