Daniel Povey
fc873cc50d
Make epsilon in BasicNorm learnable, optionally.
2022-03-15 17:00:17 +08:00
Daniel Povey
1962fe298b
Add deriv-balancer at output of embedding.
2022-03-15 14:35:15 +08:00
Daniel Povey
86e5dcba11
Remove max-positive constraint in deriv-balancing; add second DerivBalancer in conv module.
2022-03-15 13:10:35 +08:00
Daniel Povey
788963d40a
Merge branch 'randcombine1_expscale3_rework2c_maxabs1000_maxp0.95_noexp' into randcombine1_expscale3_rework2c_maxabs1000_maxp0.95_noexp_convderiv
2022-03-14 14:37:40 +08:00
Daniel Povey
ae25688253
Make DoubleSwish more memory efficient
2022-03-14 11:02:32 +08:00
Daniel Povey
437e8b2083
Reduce max-abs limit from 1000 to 100; introduce 2 DerivBalancer modules in conv layer.
2022-03-13 23:31:08 +08:00
Daniel Povey
f351777e9c
Remove ExpScale in feedforward layers.
2022-03-13 17:29:39 +08:00
Daniel Povey
5d69acb25b
Add max-abs-value
2022-03-13 13:15:20 +08:00
Daniel Povey
e6a501d3c8
Add max-abs-value constraint in DerivBalancer
2022-03-13 11:52:13 +08:00
Daniel Povey
2117f46361
DoubleSwish fix
2022-03-12 19:02:14 +08:00
Daniel Povey
be0a79cbca
Replace ExpScaleRelu with DoubleSwish()
2022-03-12 19:00:48 +08:00
Daniel Povey
a24572abd1
Bug-fix RE bias
2022-03-12 17:28:43 +08:00
Daniel Povey
a392cb9fbc
Reduce initial scaling of modules
2022-03-12 16:53:03 +08:00
Daniel Povey
ca8cf2a73b
Another rework, use scales on linear/conv
2022-03-12 15:38:13 +08:00
Daniel Povey
2d3a76292d
Set scaling on SwishExpScale
2022-03-11 20:12:45 +08:00
Daniel Povey
98156711ef
Introduce in_scale=0.5 for SwishExpScale
2022-03-11 19:07:34 +08:00
Daniel Povey
a0d5e2932c
Reduce min_abs from 0.5 to 0.2
2022-03-11 18:17:49 +08:00
Daniel Povey
bec33e6855
init 1st conv module to smaller variance
2022-03-11 16:37:17 +08:00
Daniel Povey
bcf417fce2
Change max_factor in DerivBalancer from 0.025 to 0.01; fix scaling code.
2022-03-11 14:47:46 +08:00
Daniel Povey
137eae0b95
Reduce max_factor to 0.01
2022-03-11 14:42:17 +08:00
Daniel Povey
e3e14cf7a4
Change min-abs threshold from 0.2 to 0.5
2022-03-11 14:16:33 +08:00
Daniel Povey
76560f255c
Add min-abs-value 0.2
2022-03-10 23:48:46 +08:00
Daniel Povey
2fa9c636a4
use nonzero threshold in DerivBalancer
2022-03-10 23:24:55 +08:00
Daniel Povey
b55472bb42
Replace most normalizations with scales (still have norm in conv)
2022-03-10 14:43:54 +08:00
Daniel Povey
059b57ad37
Add BasicNorm module
2022-03-10 14:32:05 +08:00
Daniel Povey
e2ace9d545
Replace norm on input layer with scale of 0.1.
2022-03-07 11:24:04 +08:00
Daniel Povey
a37d98463a
Restore ConvolutionModule to state before changes; change all Swish,Swish(Swish) to SwishOffset.
2022-03-06 11:55:02 +08:00
Daniel Povey
8a8b81cd18
Replace relu with swish-squared.
2022-03-05 22:21:42 +08:00
Daniel Povey
5f2c0a09b7
Convert swish nonlinearities to ReLU
2022-03-05 16:28:24 +08:00
Daniel Povey
65b09dd5f2
Double the threshold in brelu; slightly increase max_factor.
2022-03-05 00:07:14 +08:00
Daniel Povey
6252282fd0
Add deriv-balancing code
2022-03-04 20:19:11 +08:00
Daniel Povey
eb3ed54202
Reduce scale from 50 to 20
2022-03-04 15:56:45 +08:00
Daniel Povey
7e88999641
Increase scale from 20 to 50.
2022-03-04 14:31:29 +08:00
Daniel Povey
3207bd98a9
Increase scale on Scale from 4 to 20
2022-03-04 13:16:40 +08:00
Daniel Povey
3d9ddc2016
Fix backprop bug
2022-03-04 12:29:44 +08:00
Daniel Povey
bc6c720e25
Combine ExpScale and swish for memory reduction
2022-03-04 10:52:05 +08:00
Daniel Povey
23b3aa233c
Double learning rate of exp-scale units
2022-03-04 00:42:37 +08:00
Daniel Povey
5c177fc52b
pelu_base->expscale, add 2xExpScale in subsampling, and in feedforward units.
2022-03-03 23:52:03 +08:00
Daniel Povey
3fb559d2f0
Add baseline for the PeLU expt, keeping only the small normalization-related changes.
2022-03-02 18:27:08 +08:00
Daniel Povey
9ed7d55a84
Small bug fixes/imports
2022-03-02 16:34:55 +08:00
Daniel Povey
9d1b4ae046
Add pelu to this good-performing setup.
2022-03-02 16:33:27 +08:00
Fangjun Kuang
a80e58e15d
Refactor decode.py to make it more readable and more modular. (#44)
* Refactor decode.py to make it more readable and more modular.
* Fix an error.
  Nbest.fsa should always have token IDs as labels and word IDs as aux_labels.
* Add nbest decoding.
* Compute edit distance with k2.
* Refactor nbest-oracle.
* Add rescore with nbest lists.
* Add whole-lattice rescoring.
* Add rescoring with attention decoder.
* Refactoring.
* Fixes after refactoring.
* Fix a typo.
* Minor fixes.
* Replace [] with () for shapes.
* Use k2 v1.9
* Use Levenshtein graphs/alignment from k2 v1.9
* [doc] Require k2 >= v1.9
* Minor fixes.
2021-09-20 15:44:54 +08:00
pkufool
19c4214958
Fix code style and add copyright. (#18)
* Fix style and add copyright
* Minor fix
* Remove duplicate lines
* Reformat conformer.py by black
* Reformat code style with black.
* Fix github workflows
* Fix lhotse installation
* Install icefall requirements
* Update k2 version, remove lhotse from test workflow
2021-08-23 10:43:59 +08:00
Fangjun Kuang
5a0b9bcb23
Refactoring (#4)
* Fix an error in TDNN-LSTM training.
* WIP: Refactoring
* Refactor transformer.py
* Remove unused code.
* Minor fixes.
2021-08-04 14:53:02 +08:00