Daniel Povey | 3a71a53d8d | Set lr_factor on to_scores, max_abs=4.0 on balancer | 2023-05-23 10:56:03 +08:00
Daniel Povey | 45043e2e21 | Merge branch 'zlm25' into zlm26 | 2023-05-20 22:24:15 +08:00
Daniel Povey | 8dc070ce37 | Increase all ff dims; decrease batch size. | 2023-05-20 13:35:23 +08:00
Daniel Povey | c1de4cc847 | Remove factor of 2 in weights_discarded | 2023-05-19 20:13:12 +08:00
Daniel Povey | 4a425f7eb5 | Half the time, flip weights_discarded | 2023-05-19 18:04:05 +08:00
Daniel Povey | 7d162bf41e | Move where srand called | 2023-05-19 16:43:21 +08:00
Daniel Povey | f37ec0f0da | Include start batch in seed | 2023-05-19 16:39:13 +08:00
Daniel Povey | 5fc0cce553 | Introduce factor of 2 to more strongly penalize discarded weights. | 2023-05-19 16:31:45 +08:00
Daniel Povey | 824d7b4492 | Add evaluate.py | 2023-05-19 11:58:32 +08:00
Daniel Povey | fb758b3540 | Fix f-string bug | 2023-05-18 22:29:13 +08:00
Daniel Povey | 769033c857 | Increase eps; make it added not applied as floor. | 2023-05-18 20:08:19 +08:00
Daniel Povey | 57a023902c | Remove flipping of weights; reduce eps. | 2023-05-18 19:50:16 +08:00
Daniel Povey | c487f9a0ef | Try removing weight_scale | 2023-05-18 18:41:39 +08:00
Daniel Povey | d2c198c072 | Implement weight_scale, set weight_scale=10 | 2023-05-18 15:48:14 +08:00
Daniel Povey | f6c7392430 | Bug fix | 2023-05-18 15:37:33 +08:00
Daniel Povey | cdfa388ac0 | Revert optim schedule | 2023-05-18 15:35:23 +08:00
Daniel Povey | 299482d02d | More debug print | 2023-05-18 15:12:57 +08:00
Daniel Povey | e4a774cb98 | Warm up lr more slowly | 2023-05-18 15:03:44 +08:00
Daniel Povey | 76e6726178 | Implement random rotation of dims | 2023-05-18 14:56:44 +08:00
Daniel Povey | d631ffec5b | indentation change | 2023-05-18 14:49:56 +08:00
Daniel Povey | e976af699e | Remove unused variable | 2023-05-18 14:17:31 +08:00
Daniel Povey | a514d23df7 | Change how we penalize weights | 2023-05-18 14:14:50 +08:00
Daniel Povey | 26cf13a3e1 | Revert batch size to 20 | 2023-05-18 14:04:14 +08:00
Daniel Povey | 5cd2df0cd6 | Increase batch size from 20 to 22 | 2023-05-18 13:57:26 +08:00
Daniel Povey | 15aca1fb4a | Simplify dataloader code | 2023-05-18 13:55:52 +08:00
Daniel Povey | 9367ea3646 | Don't drop last batch | 2023-05-18 12:47:28 +08:00
Daniel Povey | eb64130787 | Reverse zlm9..zlm12 | 2023-05-17 17:31:24 +08:00
Daniel Povey | 5d7517e382 | Set batch size back to 20 | 2023-05-17 14:56:38 +08:00
Daniel Povey | 24e8a7a8fd | Remove pointless assertion | 2023-05-17 14:54:29 +08:00
Daniel Povey | 8fce9a05fc | Revert batch size to 18 | 2023-05-17 14:53:53 +08:00
Daniel Povey | 844844a02d | Reduce batch size from 21 to 20 | 2023-05-17 14:28:56 +08:00
Daniel Povey | e25929c256 | Reduce batch size to 21 | 2023-05-17 13:24:26 +08:00
Daniel Povey | 62c34f15c6 | Remove print statement | 2023-05-17 13:22:02 +08:00
Daniel Povey | e4246f6ba3 | Reduce batch size from 24 to 22 | 2023-05-17 13:20:23 +08:00
Daniel Povey | 6dce7e251d | Increase batch size | 2023-05-17 13:17:00 +08:00
Daniel Povey | 53410608a6 | Try to implement test mode; fix issue where middle stack had not been downsampled. | 2023-05-17 13:03:19 +08:00
Daniel Povey | 30ace76fbc | Add depthwise conv to decoder | 2023-05-17 11:26:41 +08:00
Daniel Povey | 610b2270aa | Bug fixes | 2023-05-16 23:08:13 +08:00
Daniel Povey | a405106d2f | Add 1-d convolution to text embedding module; reduce batch size | 2023-05-16 20:05:52 +08:00
Daniel Povey | 399a79ace6 | Change chunk-size setup | 2023-05-16 19:47:23 +08:00
Daniel Povey | a6eb45840a | Reduce batch size | 2023-05-16 17:39:59 +08:00
Daniel Povey | e062c71076 | Efficiency, small fix | 2023-05-16 17:34:21 +08:00
Daniel Povey | cf93d1f129 | Bug fix regarding chunk-size reshaping | 2023-05-16 17:30:48 +08:00
Daniel Povey | 5f5df4367d | Fix error in how src was reshaped | 2023-05-16 17:19:47 +08:00
Daniel Povey | 0412d19f50 | Increase batch size | 2023-05-16 16:33:17 +08:00
Daniel Povey | 3f72813a96 | Various bug fixes, implementing chunking | 2023-05-16 16:27:09 +08:00
Daniel Povey | 0006a4c4db | Implement chunk sizes, to the extent that the program runs. | 2023-05-16 16:13:20 +08:00
Daniel Povey | 4562b25a6a | Remove unused options | 2023-05-16 14:25:19 +08:00
Daniel Povey | bfeeddda81 | Reduce mem consumption of softmax backward | 2023-05-16 12:18:09 +08:00
Daniel Povey | 465d41c429 | Increase batch size | 2023-05-16 12:13:13 +08:00