yaozengwei
|
d27e61170b
|
set --base-lr=0.045 as default
|
2023-04-12 19:12:07 +08:00 |
|
Daniel Povey
|
b526f3af00
|
Increase num layers
|
2023-04-04 15:39:32 +08:00 |
|
Daniel Povey
|
c4f669ef00
|
Increase feedforward dims and num layers
|
2023-04-04 14:41:23 +08:00 |
|
Daniel Povey
|
7ab1e7f5ec
|
Combine two layers into one.
|
2023-04-04 12:14:18 +08:00 |
|
Daniel Povey
|
f59da65d82
|
Remove some more unused code; rename BasicNorm->BiasNorm, Zipformer->Zipformer2
|
2023-03-06 14:27:11 +08:00 |
|
Daniel Povey
|
686e7e8828
|
Remove some unhelpful or unused options in decode.py, setting equivalent to --left-context=0
for padding. Restore default of causal training.
|
2023-02-13 12:58:33 +08:00 |
|
Daniel Povey
|
dc481ca419
|
Disable causal training; add balancers in decoder.
|
2023-02-11 23:10:21 +08:00 |
|
Daniel Povey
|
f9f546968c
|
Revert warmup_batches change; make code change to avoid NaN in attn_weights
|
2023-02-11 18:46:05 +08:00 |
|
Daniel Povey
|
b0c87a93d2
|
Increase warmup of LR from 500 to 1000 batches
|
2023-02-11 18:27:20 +08:00 |
|
Daniel Povey
|
329175c897
|
Change how chunk-size is specified
|
2023-02-11 14:35:31 +08:00 |
|
Daniel Povey
|
e7e7560bba
|
Implement chunking
|
2023-02-10 15:02:29 +08:00 |
|
Daniel Povey
|
167b58baa0
|
Make output dim of Zipformer be max dim
|
2023-01-14 14:29:29 +08:00 |
|
Daniel Povey
|
fb7a967276
|
Increase unmasked dims
|
2023-01-13 17:38:11 +08:00 |
|
Daniel Povey
|
bebc27f274
|
Increasing encoder-dim of some layers, and unmasked-dim
|
2023-01-13 17:36:45 +08:00 |
|
Daniel Povey
|
e6af583ee1
|
Increase encoder-dim of slowest stack from 320 to 384
|
2023-01-13 14:40:42 +08:00 |
|
Daniel Povey
|
a88587dc8a
|
Fix comment; have 6, not 4, layers in most-downsampled stack.
|
2023-01-13 00:12:46 +08:00 |
|
Daniel Povey
|
bac72718f0
|
Bug fixes, config changes
|
2023-01-12 22:11:42 +08:00 |
|
Daniel Povey
|
1e04c3d892
|
Reduce dimension for speed, have varying dims
|
2023-01-12 21:15:39 +08:00 |
|
Daniel Povey
|
c7107ead64
|
Fix bug in get_adjusted_batch_count
|
2023-01-07 17:45:22 +08:00 |
|
Daniel Povey
|
9242800d42
|
Remove the 8x-subsampled stack
|
2023-01-07 12:59:57 +08:00 |
|
Daniel Povey
|
ef48019d6e
|
Reduce feedforward-dims
|
2023-01-06 22:26:58 +08:00 |
|
Daniel Povey
|
6a762914bf
|
Increase base-lr from 0.05 to 0.055
|
2023-01-06 13:35:57 +08:00 |
|
Daniel Povey
|
90c02b471c
|
Revert base LR to 0.05
|
2023-01-05 16:27:43 +08:00 |
|
Daniel Povey
|
067b861c70
|
Use largest LR for printing
|
2023-01-05 14:46:15 +08:00 |
|
Daniel Povey
|
6c7fd8c046
|
Increase base-lr to 0.06
|
2023-01-05 14:23:59 +08:00 |
|
Daniel Povey
|
0d7161ebec
|
Use get_parameter_groups_with_lr in train.py; bug fixes
|
2023-01-05 14:11:33 +08:00 |
|
Daniel Povey
|
b7be18c2f8
|
Keep only needed changes from Liyong's branch
|
2023-01-05 12:23:32 +08:00 |
|
Daniel Povey
|
096ebeaf23
|
Take a couple of files from Liyong's branch
|
2023-01-05 12:01:42 +08:00 |
|
Daniel Povey
|
829e4bd4db
|
Bug fix in save-bad-model code
|
2022-12-21 15:33:58 +08:00 |
|
Daniel Povey
|
266e71cc79
|
Save checkpoint on failure.
|
2022-12-21 15:09:16 +08:00 |
|
Daniel Povey
|
d2b272ab50
|
Add back 2 conformer layers in 1st stack.
|
2022-12-20 13:54:06 +08:00 |
|
Daniel Povey
|
28cac1c2dc
|
Merge debugging changes to optimizer.
|
2022-12-20 13:01:50 +08:00 |
|
Daniel Povey
|
b546ac866c
|
Merge change from 726, set batch count at start of loop for repeatability.
|
2022-12-20 11:48:50 +08:00 |
|
Daniel Povey
|
2cc5bc18be
|
Merge branch 'scaled_adam_exp731' into scaled_adam_exp737
# Conflicts:
# egs/librispeech/ASR/pruned_transducer_stateless7/zipformer.py
|
2022-12-20 00:04:49 +08:00 |
|
Daniel Povey
|
f439399ced
|
Adjust batch count w.r.t. reference duration
|
2022-12-18 14:25:23 +08:00 |
|
Daniel Povey
|
0341ff1ec5
|
One more convnext layer, two fewer conformer layers.
|
2022-12-17 22:00:58 +08:00 |
|
Daniel Povey
|
286b2021c2
|
Convert batch index to int
|
2022-12-17 16:31:45 +08:00 |
|
Daniel Povey
|
2c0cec86a3
|
Set batch count less frequently
|
2022-12-17 16:31:24 +08:00 |
|
Daniel Povey
|
912adfff7c
|
Increase all ff dims by 256
|
2022-12-08 21:11:58 +08:00 |
|
Daniel Povey
|
6e598cb18d
|
Reduce top grad_scale limit from 128 to 32.
|
2022-12-08 18:36:29 +08:00 |
|
Daniel Povey
|
3f82ee0783
|
Merge dropout schedule, 0.3 ... 0.1 over 20k batches
|
2022-12-08 18:18:46 +08:00 |
|
Daniel Povey
|
63e881f89b
|
Pass in dropout from train.py
|
2022-12-05 23:49:40 +08:00 |
|
Daniel Povey
|
0da228c587
|
Restore the computation of valid stats.
|
2022-12-05 19:50:25 +08:00 |
|
Daniel Povey
|
7999dd0dbe
|
Introduce scalar multiplication and change rules for updating gradient scale.
|
2022-12-05 16:15:20 +08:00 |
|
Daniel Povey
|
12e8c3f0fa
|
One more layer on input
|
2022-11-29 16:47:24 +08:00 |
|
Daniel Povey
|
87ef4078d3
|
Add two more layers.
|
2022-11-28 13:56:40 +08:00 |
|
Daniel Povey
|
f483f1e0ef
|
Implement attention weights sharing for successive layers, for Zipformer
|
2022-11-28 13:41:11 +08:00 |
|
Daniel Povey
|
a6fb9772a8
|
Remove 4 layers.
|
2022-11-27 13:29:29 +08:00 |
|
Daniel Povey
|
f71b1d2c3a
|
Add 4 more layers
|
2022-11-26 21:18:24 +08:00 |
|
Daniel Povey
|
320c58401f
|
Increase 2 feedforward dims from 1.5k to 2k.
|
2022-11-26 19:45:41 +08:00 |
|