Daniel Povey
137eae0b95
Reduce max_factor to 0.01
2022-03-11 14:42:17 +08:00
Daniel Povey
ab9a17413a
Scale up pos_bias_u and pos_bias_v before use.
2022-03-11 14:37:52 +08:00
Daniel Povey
e3e14cf7a4
Change min-abs threshold from 0.2 to 0.5
2022-03-11 14:16:33 +08:00
Daniel Povey
bfce5f63e4
Fix dirname
2022-03-10 23:49:09 +08:00
Daniel Povey
76560f255c
Add min-abs-value 0.2
2022-03-10 23:48:46 +08:00
Daniel Povey
2fa9c636a4
use nonzero threshold in DerivBalancer
2022-03-10 23:24:55 +08:00
Daniel Povey
425e274c82
Replace norm in ConvolutionModule with a scaling factor.
2022-03-10 16:01:53 +08:00
Daniel Povey
87b843f023
Change exp dir
2022-03-10 14:44:55 +08:00
Daniel Povey
b55472bb42
Replace most normalizations with scales (still have norm in conv)
2022-03-10 14:43:54 +08:00
Daniel Povey
059b57ad37
Add BasicNorm module
2022-03-10 14:32:05 +08:00
Daniel Povey
feb20ca84d
Merge changes to diagnostics
2022-03-10 10:31:42 +08:00
Daniel Povey
1e5455ba29
Update diagnostics
2022-03-10 10:28:48 +08:00
Daniel Povey
d074cf73c6
Extensions to diagnostics code
2022-03-09 20:37:20 +08:00
Daniel Povey
e2ace9d545
Replace norm on input layer with scale of 0.1.
2022-03-07 11:24:04 +08:00
Daniel Povey
a37d98463a
Restore ConvolutionModule to state before changes; change all Swish, Swish(Swish) to SwishOffset.
2022-03-06 11:55:02 +08:00
Daniel Povey
8a8b81cd18
Replace relu with swish-squared.
2022-03-05 22:21:42 +08:00
Fangjun Kuang
1603744469
Refactor conformer. ( #237 )
2022-03-05 19:26:06 +08:00
Daniel Povey
5f2c0a09b7
Convert swish nonlinearities to ReLU
2022-03-05 16:28:24 +08:00
Daniel Povey
0cd14ae739
Fix exp dir
2022-03-05 12:17:09 +08:00
Daniel Povey
65b09dd5f2
Double the threshold in brelu; slightly increase max_factor.
2022-03-05 00:07:14 +08:00
Daniel Povey
74f2b163de
Merge diagnostics improvement
2022-03-04 23:15:47 +08:00
Daniel Povey
6252282fd0
Add deriv-balancing code
2022-03-04 20:19:11 +08:00
Daniel Povey
eb3ed54202
Reduce scale from 50 to 20
2022-03-04 15:56:45 +08:00
Daniel Povey
9cc5999829
Fix duplicate Swish; replace norm+swish with swish+exp-scale in convolution module
2022-03-04 15:50:51 +08:00
yaozengwei
ad62981765
Add diagnostics ( #230 )
* Adding diagnostics code...
* Move diagnostics code from local dir to the shared icefall dir
* Remove the diagnostics code in the local dir
* Update docs of arguments, and remove stats_types() function in TensorDiagnosticOptions object.
* Update docs of arguments.
* Add copyright information.
* Corrected the time in copyright information.
Co-authored-by: Daniel Povey <dpovey@gmail.com>
2022-03-04 15:38:23 +08:00
Daniel Povey
7e88999641
Increase scale from 20 to 50.
2022-03-04 14:31:29 +08:00
Daniel Povey
3207bd98a9
Increase scale on Scale from 4 to 20
2022-03-04 13:16:40 +08:00
Daniel Povey
503f8d521c
Fix bug in diagnostics
2022-03-04 13:08:56 +08:00
Daniel Povey
3d9ddc2016
Fix backprop bug
2022-03-04 12:29:44 +08:00
Fangjun Kuang
2f0fbf430c
Remove duplicate files. ( #236 )
2022-03-04 11:56:31 +08:00
Daniel Povey
cd216f50b6
Add import
2022-03-04 11:03:01 +08:00
Daniel Povey
bc6c720e25
Combine ExpScale and swish for memory reduction
2022-03-04 10:52:05 +08:00
Daniel Povey
23b3aa233c
Double learning rate of exp-scale units
2022-03-04 00:42:37 +08:00
Daniel Povey
5c177fc52b
pelu_base->expscale, add 2xExpScale in subsampling, and in feedforward units.
2022-03-03 23:52:03 +08:00
Fangjun Kuang
3ec219dfa0
Add stateless transducer tutorial. ( #235 )
* WIP: Add stateless transducer tutorial.
* Add more doc.
* Minor fixes.
2022-03-03 22:33:47 +08:00
Daniel Povey
3fb559d2f0
Add baseline for the PeLU expt, keeping only the small normalization-related changes.
2022-03-02 18:27:08 +08:00
Fangjun Kuang
1ff6196c44
Fix joiner ( #234 )
* Add tests for Joiner
* Remove duplicate files.
2022-03-02 16:41:14 +08:00
Daniel Povey
9ed7d55a84
Small bug fixes/imports
2022-03-02 16:34:55 +08:00
Daniel Povey
9d1b4ae046
Add pelu to this good-performing setup.
2022-03-02 16:33:27 +08:00
Fangjun Kuang
50d2281524
Add modified transducer loss for AIShell dataset ( #219 )
* Add modified transducer for aishell.
* Minor fixes.
* Add extra data in transducer training. The extra data is from http://www.openslr.org/62/
* Update export.py and pretrained.py
* Update CI to install pretrained models with aishell.
* Update results.
* Update results.
* Update README.
* Use symlinks to avoid copies.
2022-03-02 16:02:38 +08:00
Fangjun Kuang
05cb297858
Update result for full libri + GigaSpeech using transducer_stateless. ( #231 )
2022-03-01 17:01:46 +08:00
Fangjun Kuang
72f838dee1
Update results for transducer_stateless after training for more epochs. ( #207 )
2022-03-01 16:35:02 +08:00
Daniel Povey
2ff520c800
Improvements to diagnostics (re: those with 1 dim)
2022-02-28 12:22:27 +08:00
Daniel Povey
c1063def95
First version of rand-combine iterated-training-like idea.
2022-02-27 17:34:58 +08:00
Daniel Povey
63d8d935d4
Refactor/simplify ConformerEncoder
2022-02-27 13:56:15 +08:00
Daniel Povey
581786a6d3
Adding diagnostics code...
2022-02-27 13:44:43 +08:00
PF Luo
ac7c2d84bc
minor fix for aishell recipe ( #223 )
* just remove unnecessary torch.sum
* minor fixes for aishell
2022-02-23 08:33:20 +08:00
Fangjun Kuang
2332ba312d
Begin to use multiple datasets in training ( #213 )
* Begin to use multiple datasets.
* Finish preparing training datasets.
* Minor fixes
* Copy files.
* Finish training code.
* Display losses for gigaspeech and librispeech separately.
* Fix decode.py
* Make the probability to select a batch from GigaSpeech configurable.
* Update results.
* Minor fixes.
2022-02-21 15:27:27 +08:00
Fangjun Kuang
1c35ae1dba
Reset seed at the beginning of each epoch. ( #221 )
* Reset seed at the beginning of each epoch.
* Use a different seed for each epoch.
2022-02-21 15:16:39 +08:00
Fangjun Kuang
cbf8c18ebd
Minor fixes for aishell ( #218 )
* Minor fixes to aishell.
* Minor fixes.
2022-02-19 22:28:19 +08:00