135 Commits

Author SHA1 Message Date
Daniel Povey
fa696e919b Add memory to model 2023-05-01 20:47:09 +08:00
yaozengwei
55a1abc9da separate Conv2dSubsampling from Zipformer 2023-04-27 10:11:47 +08:00
yaozengwei
2e80841790 set --lr-batches=7500 2023-04-24 15:49:41 +08:00
yaozengwei
d27e61170b set --base-lr=0.045 as default 2023-04-12 19:12:07 +08:00
Daniel Povey
b526f3af00 Increase num layers 2023-04-04 15:39:32 +08:00
Daniel Povey
c4f669ef00 Increase feedforward dims and num layers 2023-04-04 14:41:23 +08:00
Daniel Povey
7ab1e7f5ec Combine two layers into one. 2023-04-04 12:14:18 +08:00
Daniel Povey
f59da65d82 Remove some more unused code; rename BasicNorm->BiasNorm, Zipformer->Zipformer2 2023-03-06 14:27:11 +08:00
Daniel Povey
686e7e8828 Remove some unhelpful or unused options in decode.py, setting equivalent to --left-context=0 for padding. Restore default of causal training. 2023-02-13 12:58:33 +08:00
Daniel Povey
dc481ca419 Disable causal training; add balancers in decoder. 2023-02-11 23:10:21 +08:00
Daniel Povey
f9f546968c Revert warmup_batches change; make code change to avoid nan in attn_weights 2023-02-11 18:46:05 +08:00
Daniel Povey
b0c87a93d2 Increase warmup of LR from 500 to 1000 batches 2023-02-11 18:27:20 +08:00
Daniel Povey
329175c897 Change how chunk-size is specified 2023-02-11 14:35:31 +08:00
Daniel Povey
e7e7560bba Implement chunking 2023-02-10 15:02:29 +08:00
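The chunking introduced in this commit (for streaming-style operation) can be illustrated with a chunk-causal attention mask, where each frame may attend only to frames in its own chunk or earlier ones. This is a minimal generic sketch, not the recipe's actual implementation; the function name and mask convention are assumptions:

```python
def chunk_causal_mask(seq_len, chunk_size):
    """Build a boolean attention mask for chunk-wise causal attention.

    mask[i][j] is True iff position i is allowed to attend to position j,
    i.e. j's chunk index does not exceed i's chunk index.
    Generic illustration only, not the Zipformer code itself.
    """
    return [
        [(j // chunk_size) <= (i // chunk_size) for j in range(seq_len)]
        for i in range(seq_len)
    ]
```

For example, with `chunk_size=2` and `seq_len=4`, frames 0-1 see only chunk 0, while frames 2-3 see chunks 0 and 1.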
Daniel Povey
167b58baa0 Make output dim of Zipformer be max dim 2023-01-14 14:29:29 +08:00
Daniel Povey
fb7a967276 Increase unmasked dims 2023-01-13 17:38:11 +08:00
Daniel Povey
bebc27f274 Increase encoder-dim of some layers, and unmasked-dim 2023-01-13 17:36:45 +08:00
Daniel Povey
e6af583ee1 Increase encoder-dim of slowest stack from 320 to 384 2023-01-13 14:40:42 +08:00
Daniel Povey
a88587dc8a Fix comment; have 6, not 4, layers in most-downsampled stack. 2023-01-13 00:12:46 +08:00
Daniel Povey
bac72718f0 Bug fixes, config changes 2023-01-12 22:11:42 +08:00
Daniel Povey
1e04c3d892 Reduce dimension for speed, have varying dims 2023-01-12 21:15:39 +08:00
Daniel Povey
c7107ead64 Fix bug in get_adjusted_batch_count 2023-01-07 17:45:22 +08:00
Daniel Povey
9242800d42 Remove the 8x-subsampled stack 2023-01-07 12:59:57 +08:00
Daniel Povey
ef48019d6e Reduce feedforward-dims 2023-01-06 22:26:58 +08:00
Daniel Povey
6a762914bf Increase base-lr from 0.05 to 0.055 2023-01-06 13:35:57 +08:00
Daniel Povey
90c02b471c Revert base LR to 0.05 2023-01-05 16:27:43 +08:00
Daniel Povey
067b861c70 Use largest LR for printing 2023-01-05 14:46:15 +08:00
Daniel Povey
6c7fd8c046 Increase base-lr to 0.06 2023-01-05 14:23:59 +08:00
Daniel Povey
0d7161ebec Use get_parameter_groups_with_lr in train.py; bug fixes 2023-01-05 14:11:33 +08:00
Daniel Povey
b7be18c2f8 Keep only needed changes from Liyong's branch 2023-01-05 12:23:32 +08:00
Daniel Povey
096ebeaf23 Take a couple of files from Liyong's branch 2023-01-05 12:01:42 +08:00
Daniel Povey
829e4bd4db Bug fix in save-bad-model code 2022-12-21 15:33:58 +08:00
Daniel Povey
266e71cc79 Save checkpoint on failure. 2022-12-21 15:09:16 +08:00
Daniel Povey
d2b272ab50 Add back 2 conformer layers in 1st stack. 2022-12-20 13:54:06 +08:00
Daniel Povey
28cac1c2dc Merge debugging changes to optimizer. 2022-12-20 13:01:50 +08:00
Daniel Povey
b546ac866c Merge change from 726, set batch count at start of loop for repeatability. 2022-12-20 11:48:50 +08:00
Daniel Povey
2cc5bc18be Merge branch 'scaled_adam_exp731' into scaled_adam_exp737
# Conflicts:
#	egs/librispeech/ASR/pruned_transducer_stateless7/zipformer.py
2022-12-20 00:04:49 +08:00
Daniel Povey
f439399ced Adjust batch count w.r.t. reference duration 2022-12-18 14:25:23 +08:00
Daniel Povey
0341ff1ec5 One more convnext layer, two fewer conformer layers. 2022-12-17 22:00:58 +08:00
Daniel Povey
286b2021c2 Convert batch index to int 2022-12-17 16:31:45 +08:00
Daniel Povey
2c0cec86a3 Set batch count less frequently 2022-12-17 16:31:24 +08:00
Daniel Povey
912adfff7c Increase all ff dims by 256 2022-12-08 21:11:58 +08:00
Daniel Povey
6e598cb18d Reduce top grad_scale limit from 128 to 32. 2022-12-08 18:36:29 +08:00
Daniel Povey
3f82ee0783 Merge dropout schedule, 0.3 ... 0.1 over 20k batches 2022-12-08 18:18:46 +08:00
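The schedule merged here, dropout decaying from 0.3 to 0.1 over the first 20k batches, suggests a simple piecewise-linear rule. A hypothetical sketch (the function name and exact interpolation are assumptions, not the recipe's code):

```python
def scheduled_dropout(batch_count, start=0.3, end=0.1, num_batches=20000):
    """Linearly anneal dropout from `start` to `end` over the first
    `num_batches` batches, then hold it at `end`.
    Illustrative only; the actual schedule implementation may differ.
    """
    frac = min(batch_count, num_batches) / num_batches
    return start + (end - start) * frac
```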
Daniel Povey
63e881f89b Pass in dropout from train.py 2022-12-05 23:49:40 +08:00
Daniel Povey
0da228c587 Restore the computation of valid stats. 2022-12-05 19:50:25 +08:00
Daniel Povey
7999dd0dbe Introduce scalar multiplication and change rules for updating gradient scale. 2022-12-05 16:15:20 +08:00
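The "rules for updating gradient scale" mentioned here are not spelled out in the log; a common dynamic loss-scaling pattern (halve on overflow, grow after a run of clean steps, clamp to a ceiling, as in the grad_scale limit mentioned in a nearby commit) looks roughly like this. All names and constants below are assumptions for illustration:

```python
def update_grad_scale(scale, grads_finite, good_steps,
                      growth_interval=2000, max_scale=32.0, min_scale=1.0 / 32):
    """Generic dynamic loss-scaling update (illustration only).

    Returns (new_scale, new_good_steps):
    - on non-finite grads, halve the scale (down to min_scale) and reset;
    - after `growth_interval` consecutive clean steps, double the scale
      (up to max_scale) and reset the counter.
    """
    if not grads_finite:
        return max(scale / 2.0, min_scale), 0
    good_steps += 1
    if good_steps >= growth_interval:
        return min(scale * 2.0, max_scale), 0
    return scale, good_steps
```

This mirrors the general shape of AMP-style gradient scaling; the actual rules in the optimizer commit may differ in thresholds and update factors.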
Daniel Povey
12e8c3f0fa One more layer on input 2022-11-29 16:47:24 +08:00
Daniel Povey
87ef4078d3 Add two more layers. 2022-11-28 13:56:40 +08:00
Daniel Povey
f483f1e0ef Implement attention weights sharing for successive layers, for Zipformer 2022-11-28 13:41:11 +08:00