Daniel Povey
87c92efbfe
Changes from upstream/master
2022-03-16 21:49:15 +08:00
Daniel Povey
e838c192ef
Cosmetic changes/renaming things
2022-03-16 19:27:45 +08:00
Daniel Povey
dfc75752c4
Remove some dead code.
2022-03-16 18:06:01 +08:00
Daniel Povey
c82db4184a
Remove xscale from pos_embedding
2022-03-16 15:50:11 +08:00
Daniel Povey
6561743d7b
bug fix re sqrt
2022-03-16 14:55:17 +08:00
Daniel Povey
0e9cad3f1f
Modifying initialization from normal->uniform; add initial_scale when initializing
2022-03-16 14:42:53 +08:00
Daniel Povey
00be56c7a0
Remove dead code
2022-03-16 12:49:00 +08:00
Daniel Povey
a783b96467
Fix typo
2022-03-16 12:43:44 +08:00
Daniel Povey
633213424d
Rework of initialization
2022-03-16 12:42:59 +08:00
Daniel Povey
1331199530
Merge branch 'specaugmod_baseline' into randcombine1_expscale3_rework2c_maxabs1000_maxp0.95_noexp_convderiv2warmup_scale_0mean
2022-03-15 23:47:03 +08:00
Daniel Povey
261d7602a7
Draft of 0mean changes..
2022-03-15 23:46:53 +08:00
Daniel Povey
fc873cc50d
Make epsilon in BasicNorm learnable, optionally.
2022-03-15 17:00:17 +08:00
Daniel Povey
b2abcd721a
Add more stats.
2022-03-15 16:38:19 +08:00
Daniel Povey
1962fe298b
Add deriv-balancer at output of embedding.
2022-03-15 14:35:15 +08:00
Daniel Povey
2e6d170be8
Merge branch 'specaugmod_baseline' into randcombine1_expscale3_rework2c_maxabs1000_maxp0.95_noexp_convderiv3warmup_embed
2022-03-15 14:33:08 +08:00
Daniel Povey
21ebd356e7
Add some extra info to diagnostics
2022-03-15 13:49:15 +08:00
Daniel Povey
86e5dcba11
Remove max-positive constraint in deriv-balancing; add second DerivBalancer in conv module.
2022-03-15 13:10:35 +08:00
Daniel Povey
a23010fc10
Add warmup mode
2022-03-14 23:04:51 +08:00
Daniel Povey
8d17a05dd2
Reduce constraints from deriv-balancer in ConvModule.
2022-03-14 19:23:33 +08:00
Daniel Povey
788963d40a
Merge branch 'randcombine1_expscale3_rework2c_maxabs1000_maxp0.95_noexp' into randcombine1_expscale3_rework2c_maxabs1000_maxp0.95_noexp_convderiv
2022-03-14 14:37:40 +08:00
Daniel Povey
ae25688253
Make DoubleSwish more memory efficient
2022-03-14 11:02:32 +08:00
Mingshuang Luo
d0d806560f
Change for asr_datamodule.py (#241)
* change for asr_datamodule.py
* fix style check
* do a fix
2022-03-14 00:30:58 +08:00
Daniel Povey
437e8b2083
Reduce max-abs limit from 1000 to 100; introduce 2 DerivBalancer modules in conv layer.
2022-03-13 23:31:08 +08:00
Daniel Povey
f351777e9c
Remove ExpScale in feedforward layers.
2022-03-13 17:29:39 +08:00
Daniel Povey
97c0bb82d3
Change dir name
2022-03-13 13:19:20 +08:00
Daniel Povey
5d69acb25b
Add max-abs-value
2022-03-13 13:15:20 +08:00
Daniel Povey
e6a501d3c8
Add max-abs-value constraint in DerivBalancer
2022-03-13 11:52:13 +08:00
Daniel Povey
6042c96db2
Use learnable scales for joiner and decoder
2022-03-12 20:54:46 +08:00
Daniel Povey
2117f46361
DoubleSwish fix
2022-03-12 19:02:14 +08:00
Daniel Povey
be0a79cbca
Replace ExpScaleRelu with DoubleSwish()
2022-03-12 19:00:48 +08:00
Daniel Povey
db7a3b6eea
Reduce initial_scale.
2022-03-12 18:50:02 +08:00
Daniel Povey
b7b2d8970b
Cosmetic change
2022-03-12 17:47:35 +08:00
Daniel Povey
a24572abd1
Bug-fix RE bias
2022-03-12 17:28:43 +08:00
Daniel Povey
a392cb9fbc
Reduce initial scaling of modules
2022-03-12 16:53:03 +08:00
Fangjun Kuang
bb7f6ed6b7
Add modified beam search for pruned rnn-t. (#248)
* Add modified beam search for pruned rnn-t.
* Fix style issues.
* Update RESULTS.md.
* Fix typos.
* Minor fixes.
* Test the pre-trained model using GitHub actions.
* Let the user install optimized_transducer on her own.
* Fix errors in GitHub CI.
2022-03-12 16:16:55 +08:00
Fangjun Kuang
2f4e71f433
Add force alignment for stateless transducer. (#239)
* Add force alignment for stateless transducer.
* Add more documentation.
* Compute word starting time from framewise token alignment.
* Update README to include force alignment information.
* Fix typos.
* Fix more typos.
* Fixes after review.
2022-03-12 16:16:15 +08:00
Daniel Povey
d906bc2a4f
Change dir name
2022-03-12 15:38:39 +08:00
Daniel Povey
ca8cf2a73b
Another rework, use scales on linear/conv
2022-03-12 15:38:13 +08:00
Daniel Povey
0abba9e7a2
Fix self.post-scale-mha
2022-03-12 11:20:44 +08:00
Daniel Povey
76a2b9d362
Add learnable post-scale for mha
2022-03-12 11:19:49 +08:00
Daniel Povey
7eb5a84cbe
Add identity pre_norm_final for diagnostics.
2022-03-11 21:00:43 +08:00
Daniel Povey
2d3a76292d
Set scaling on SwishExpScale
2022-03-11 20:12:45 +08:00
Daniel Povey
cc558faf26
Fix scale from 0.5 to 2.0 as I really intended..
2022-03-11 19:11:50 +08:00
Daniel Povey
98156711ef
Introduce in_scale=0.5 for SwishExpScale
2022-03-11 19:07:34 +08:00
Daniel Povey
a0d5e2932c
Reduce min_abs from 0.5 to 0.2
2022-03-11 18:17:49 +08:00
Daniel Povey
5eafccb369
Change how scales are applied; fix residual bug
2022-03-11 17:46:33 +08:00
Daniel Povey
bec33e6855
init 1st conv module to smaller variance
2022-03-11 16:37:17 +08:00
Daniel Povey
bcf417fce2
Change max_factor in DerivBalancer from 0.025 to 0.01; fix scaling code.
2022-03-11 14:47:46 +08:00
Daniel Povey
2940d3106f
Fix q*scaling logic
2022-03-11 14:44:13 +08:00
Daniel Povey
137eae0b95
Reduce max_factor to 0.01
2022-03-11 14:42:17 +08:00