Daniel Povey
46ca1cd4c4
Add Decorrelate module that adds something to gradients in backward pass
2022-06-08 19:44:58 +08:00
Daniel Povey
9fb8645168
Implement JoinDropout
2022-06-08 16:17:42 +08:00
Daniel Povey
e7886d49a9
Bug fix
2022-06-08 11:05:29 +08:00
Daniel Povey
a83bde1372
Simplify implementation as current idea was not working to decorrelate
2022-06-08 10:24:41 +08:00
Daniel Povey
135be1e19c
Change dropout_rate from 0.2 to 0.1; fix logging statement; fix assignment to rand_scales, nonrand_scales to use [:]
2022-06-08 00:42:04 +08:00
Daniel Povey
a6050cb2de
Implement new, more principled but maybe slower version.
2022-06-07 23:38:38 +08:00
Daniel Povey
75c822c7e9
Pre and post-multiply by inv_sqrt_stddev,stddev
2022-06-07 20:32:18 +08:00
Daniel Povey
a270973b69
Add gaussian version of decorrelation
2022-06-07 18:55:48 +08:00
Daniel Povey
5d24489752
Have 2 scales on dropout
2022-06-07 18:31:42 +08:00
Daniel Povey
53ca61db7a
Reduce scale on decorrelation by 5, to 0.01
2022-06-07 17:10:54 +08:00
Daniel Povey
7c6d923d3f
Add decorrelation to joiner
2022-06-07 16:47:54 +08:00
Daniel Povey
cd6b707e2b
Various bug fixes
2022-06-07 16:45:32 +08:00
Daniel Povey
40a0934b4e
Implement GaussProjDrop
2022-06-07 11:51:24 +08:00
Daniel Povey
4352a16f57
Fix bug that relates to modifying U in place
2022-06-06 17:43:15 +08:00
Daniel Povey
31848dcd11
Randomize the projections
2022-06-06 16:07:28 +08:00
Daniel Povey
6fdb356315
Bug fix RE GPU device
2022-06-06 15:40:20 +08:00
Daniel Povey
71e927411a
Implement FixedProjDrop
2022-06-06 15:38:59 +08:00
Daniel Povey
28df3ba43f
Fix bug re half precision
2022-06-05 23:26:59 +08:00
Daniel Povey
d76aedb790
Make it work for half
2022-06-05 23:25:51 +08:00
Daniel Povey
e535887abb
Bug fixes.
2022-06-05 23:24:02 +08:00
Daniel Povey
136ffb0597
Add ProjDrop for axis-independent dropout
2022-06-05 23:00:48 +08:00
Daniel Povey
a1ae2f8fa9
Revert some accidental changes
2022-06-05 11:40:55 +08:00
fanlu
8a3068ead8
Update decode.py (#392)
* Update decode.py
fix bug `TypeError: greedy_search_batch() missing 1 required positional argument: 'encoder_out_lens'`
* fix modified_beam_search
Co-authored-by: fanlu3 <fanlu@jd.com>
2022-06-04 19:08:17 +08:00
Zengwei Yao
148f69d8d9
Update RESULTS.md (#388)
* update RESULT.md about pruned_transducer_stateless4
* Update RESULT.md
This PR is only to update RESULT.md about pruned_transducer_stateless4.
* set default value of --use-averaged-model to True
* update RESULTS.md and add decode command
* minor fix
* update export.py
* add uploaded files links
* update link
* fix typos
2022-06-04 15:52:35 +08:00
Daniel Povey
a9a172aa69
Multiply lr by 10; simplify Cain.
2022-06-04 15:48:33 +08:00
Mingshuang Luo
beab229fd7
[Ready to merge] Pruned_transducer_stateless2 for alimeeting dataset (#378)
* add pruned-rnnt2 recipe for alimeeting dataset
* update code for merging
* change LilcomHdf5Writer to ChunkedLilcomHdf5Writer
* change for test.yml
* change for test.yml
* change for test.yml
* change for workflow yml
* change for yml
* change for yml
* change for README.md
* change for yml
* solve the conflicts
* solve the conflicts
2022-06-04 13:47:46 +08:00
Daniel Povey
679972b905
Fix bug; make epsilon work both ways (small+large); increase epsilon to 0.1
2022-06-03 19:37:48 +08:00
Daniel Povey
8085ed6ef9
Turn off natural gradient update for biases.
2022-06-03 18:40:14 +08:00
Daniel Povey
3fff0c75bb
Code cleanup
2022-06-03 11:54:12 +08:00
Daniel Povey
d6e65a0e7f
Remove decompose=True
2022-06-03 11:48:45 +08:00
Daniel Povey
a66a0d84d5
Natural gradient, with power -0.5 (halfway; -1 would be NG)
2022-06-02 14:01:03 +08:00
Daniel Povey
b1f6797af1
Remove some rebalancing code that I am now not going to use.
2022-06-01 22:19:28 +08:00
Daniel Povey
0c73664aef
Reduce threshold to 1024
2022-06-01 14:42:56 +08:00
Fangjun Kuang
fbfc98f1d3
Add streaming Emformer stateless RNN-T. (#390)
* Add streaming Emformer stateless RNN-T.
* Update results for streaming Emformer.
* Minor fixes.
2022-06-01 14:31:47 +08:00
Daniel Povey
ca09b9798f
Remove decomposition code from checkpoint.py; restore double precision model_avg
2022-06-01 14:01:58 +08:00
Daniel Povey
03e07e80ce
More drafts for rebalancing code
2022-06-01 13:58:42 +08:00
Daniel Povey
9c9bf4f1e3
Some drafts of rebalancing code in optim.py
2022-06-01 11:34:19 +08:00
Daniel Povey
bc5c782294
Limit magnitude of linear_pos
2022-06-01 10:40:54 +08:00
Daniel Povey
61619c031e
Add activation balancer to stop activations in self_attn from getting too large
2022-06-01 00:40:45 +08:00
Daniel Povey
b2259184b5
Use single precision for model average; increase average-period to 200.
2022-05-31 14:31:46 +08:00
Daniel Povey
ab9eb0d52c
Use decompose=True arg for model averaging
2022-05-31 14:28:53 +08:00
Daniel Povey
1651fe0d42
Merge changes from pruned_transducer_stateless4->5
2022-05-31 13:00:11 +08:00
Daniel Povey
c7cf229f56
Revert pruned_transducer_stateless4 to upstream/master
2022-05-31 12:45:51 +08:00
Daniel Povey
741dcd1d6d
Move pruned_transducer_stateless4 to pruned_transducer_stateless7
2022-05-31 12:45:28 +08:00
Daniel Povey
8f877efec5
Remove pruned_transducer_stateless4b
2022-05-31 12:29:45 +08:00
Daniel Povey
7011956c6c
Merge remote-tracking branch 'upstream/master' into cain3d_clean_merge
2022-05-31 12:17:45 +08:00
Daniel Povey
c3df609805
Revert lrate changes
2022-05-30 16:24:40 +08:00
Daniel Povey
b01c09a693
Remove the natural gradient stuff while keeping cosmetic changes.
2022-05-30 11:56:11 +08:00
Daniel Povey
8a96f29a11
Further increase learning rate
2022-05-29 11:13:27 +08:00
Daniel Povey
2b8ea98fc2
Improve documentation; remove unused code.
2022-05-28 19:22:09 +08:00