Daniel Povey
|
1669e21c0c
|
Use decorrelation in conformer layers also
|
2022-06-09 11:31:52 +08:00 |
|
Daniel Povey
|
b9a476c7bb
|
Remove loss factor from decorr_loss_scale
|
2022-06-08 20:19:17 +08:00 |
|
Daniel Povey
|
8e56445c70
|
Try to resolve graph-freed problem
|
2022-06-08 20:07:35 +08:00 |
|
Daniel Povey
|
46ca1cd4c4
|
Add Decorrelate module that adds something to gradients in backward pass
|
2022-06-08 19:44:58 +08:00 |
|
Daniel Povey
|
9fb8645168
|
Implement JoinDropout
|
2022-06-08 16:17:42 +08:00 |
|
Daniel Povey
|
e7886d49a9
|
Bug fix
|
2022-06-08 11:05:29 +08:00 |
|
Daniel Povey
|
a83bde1372
|
Simplify implementation as current idea was not working to decorrelate
|
2022-06-08 10:24:41 +08:00 |
|
Daniel Povey
|
135be1e19c
|
Change dropout_rate from 0.2 to 0.1; fix logging statement; fix assignment to rand_scales, nonrand_scales to use [:]
|
2022-06-08 00:42:04 +08:00 |
|
Daniel Povey
|
a6050cb2de
|
Implement new, more principled but maybe slower version.
|
2022-06-07 23:38:38 +08:00 |
|
Daniel Povey
|
75c822c7e9
|
Pre and post-multiply by inv_sqrt_stddev,stddev
|
2022-06-07 20:32:18 +08:00 |
|
Daniel Povey
|
a270973b69
|
Add gaussian version of decorrelation
|
2022-06-07 18:55:48 +08:00 |
|
Daniel Povey
|
5d24489752
|
Have 2 scales on dropout
|
2022-06-07 18:31:42 +08:00 |
|
Daniel Povey
|
cd6b707e2b
|
Various bug fixes
|
2022-06-07 16:45:32 +08:00 |
|
Daniel Povey
|
40a0934b4e
|
Implement GaussProjDrop
|
2022-06-07 11:51:24 +08:00 |
|
Daniel Povey
|
4352a16f57
|
Fix bug that relates to modifying U in place
|
2022-06-06 17:43:15 +08:00 |
|
Daniel Povey
|
31848dcd11
|
Randomize the projections
|
2022-06-06 16:07:28 +08:00 |
|
Daniel Povey
|
6fdb356315
|
Bug fix RE GPU device
|
2022-06-06 15:40:20 +08:00 |
|
Daniel Povey
|
71e927411a
|
Implement FixedProjDrop
|
2022-06-06 15:38:59 +08:00 |
|
Daniel Povey
|
28df3ba43f
|
Fix bug re half precision
|
2022-06-05 23:26:59 +08:00 |
|
Daniel Povey
|
d76aedb790
|
Make it work for half
|
2022-06-05 23:25:51 +08:00 |
|
Daniel Povey
|
e535887abb
|
Bug fixes.
|
2022-06-05 23:24:02 +08:00 |
|
Daniel Povey
|
136ffb0597
|
Add ProjDrop for axis-independent dropout
|
2022-06-05 23:00:48 +08:00 |
|
Fangjun Kuang
|
f6ce135608
|
Various fixes to support torch script. (#371)
* Various fixes to support torch script.
* Add tests to ensure that the model is torch scriptable.
* Update tests.
|
2022-05-16 21:46:59 +08:00 |
|
Guo Liyong
|
78418ac37c
|
fix comments
|
2022-04-13 13:09:24 +08:00 |
|
Mingshuang Luo
|
93c60a9d30
|
Code style check for librispeech pruned transducer stateless2 (#308)
|
2022-04-11 22:15:18 +08:00 |
|
Daniel Povey
|
2545237eb3
|
Changing initial_speed from 0.25 to 01
|
2022-04-05 18:00:54 +08:00 |
|
Daniel Povey
|
179d0605ea
|
Change initialization to 0.25
|
2022-04-04 23:34:39 +08:00 |
|
Daniel Povey
|
72f4a673b1
|
First draft of new approach to learning rates + init
|
2022-04-04 20:21:34 +08:00 |
|
Daniel Povey
|
37ab0bcfa5
|
Reduce speed of some components
|
2022-03-30 11:46:23 +08:00 |
|
Daniel Povey
|
11a04c50ae
|
Change 0.025,0.05 to 0.01 in initializations
|
2022-03-21 21:29:24 +08:00 |
|
Daniel Povey
|
8cff994cd7
|
Set also scale for embedding to 0.025.
|
2022-03-18 21:30:05 +08:00 |
|
Daniel Povey
|
188eada7ac
|
Change initial std from 0.05 to 0.025.
|
2022-03-18 21:28:34 +08:00 |
|
Daniel Povey
|
c9f1aeb7d1
|
Fix bug with import
|
2022-03-18 16:40:24 +08:00 |
|
Daniel Povey
|
ba3611cefd
|
Cosmetic changes to swish
|
2022-03-18 16:35:48 +08:00 |
|
Daniel Povey
|
6769087d70
|
Remove scale_speed, make swish deriv more efficient.
|
2022-03-18 16:31:25 +08:00 |
|
Daniel Povey
|
11bea4513e
|
Add remaining files in pruned_transducer_stateless2
|
2022-03-17 11:17:52 +08:00 |
|