Daniel Povey
b6ee698278
Make LR updates less frequent later in training; fix bug with param_cov freshness (it was kept too fresh)
2022-07-15 07:59:30 +08:00
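A hedged sketch of how an update period can be made to grow during training (the function name, the base of 200 steps, and the doubling stages are illustrative assumptions, not the commit's actual schedule):

```python
def lr_update_period(step: int, base: int = 200) -> int:
    # Illustrative schedule: update every `base` steps early in
    # training, then double the period after each 10k-step stage,
    # so LR updates become progressively less frequent.
    return base * (2 ** (step // 10_000))
```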
Yuekai Zhang
c17233eca7
[Ready] [Recipes] add aishell2 (#465)
* add aishell2
* fix aishell2
* add manifest stats
* update prepare char dict
* fix lint
* setting max duration
* lint
* change context size to 1
* update result
* update hf link
* fix decoding comment
* add more decoding methods
* update result
* change context-size default to 2
2022-07-14 14:46:56 +08:00
Daniel Povey
689441b237
Reduce param_pow from 0.75 to 0.5
2022-07-14 06:08:06 +08:00
Daniel Povey
7f6fe02db9
Fix formula for smoothing (it was applying more smoothing than intended, and in the opposite sense to what was intended); also revert max_rms from 2.0 to 4.0
2022-07-14 06:06:02 +08:00
LIyong.Guo
f8d28f0998
update multi_quantization installation (#469)
* update multi_quantization installation
* Update egs/librispeech/ASR/pruned_transducer_stateless6/train.py
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2022-07-13 21:16:45 +08:00
Daniel Povey
4785245e5c
Reduce debug freq
2022-07-13 06:51:23 +08:00
Daniel Povey
d48fe0b99c
Change max rms from 10.0 to 4.0
2022-07-13 05:53:35 +08:00
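Enforcing a maximum RMS on a parameter tensor can be sketched as follows (a minimal illustration; the function name is assumed, and the actual optimizer may apply the constraint differently):

```python
import torch

def enforce_max_rms_(p: torch.Tensor, max_rms: float = 4.0) -> None:
    # rms = sqrt(mean(p**2)); if the tensor's RMS exceeds max_rms,
    # rescale the whole tensor in place so its RMS equals max_rms.
    rms = (p ** 2).mean().sqrt()
    if rms > max_rms:
        p.mul_(max_rms / rms)
```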
Zengwei Yao
bc2882ddcc
Simplified memory bank for Emformer (#440)
* init files
* use average value as memory vector for each chunk
* change tail padding length from right_context_length to chunk_length
* correct the files, ln -> cp
* fix bug in conv_emformer_transducer_stateless2/emformer.py
* fix doc in conv_emformer_transducer_stateless/emformer.py
* refactor init states for stream
* modify .flake8
* fix bug about memory mask when memory_size==0
* add @torch.jit.export for init_states function
* update RESULTS.md
* minor change
* update README.md
* modify doc
* replace torch.div() with <<
* fix bug, >> -> <<
* use i&i-1 to judge if it is a power of 2 (see the sketch after this entry)
* minor fix
* fix error in RESULTS.md
2022-07-12 19:19:58 +08:00
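The last few bullets above rest on a standard bit trick; a minimal, self-contained sketch in plain Python (not the actual Emformer code):

```python
def is_power_of_2(i: int) -> bool:
    # A positive power of two has exactly one bit set, so
    # i & (i - 1) clears that bit and leaves 0.
    return i > 0 and (i & (i - 1)) == 0

# When a factor is a power of two, shifts can replace torch.div():
# x << k multiplies by 2**k, and x >> k floor-divides by 2**k.
chunk_length = 8
assert is_power_of_2(chunk_length)
k = chunk_length.bit_length() - 1  # log2 of a power of two
assert (5 << k) == 5 * chunk_length
assert (45 >> k) == 45 // chunk_length
```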
Daniel Povey
cedfb5a377
Make max eig ratio 10
2022-07-12 13:59:58 +08:00
Daniel Povey
278358bb9f
Remove debug code
2022-07-12 08:39:14 +08:00
Daniel Povey
8c44ff26f7
Fix bug in batching code for scalars
2022-07-12 08:36:45 +08:00
Daniel Povey
25cb8308d5
Add max_block_size=512 to PrAdam
2022-07-12 08:35:14 +08:00
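Presumably max_block_size caps how large each covariance block can be, with bigger dimensions split into blocks and handled block-diagonally; a sketch of such a split (this mechanism is an assumption, not PrAdam's actual code):

```python
def split_into_blocks(size: int, max_block_size: int = 512):
    # Split a dimension of `size` into contiguous blocks of at
    # most max_block_size, e.g. 1280 -> [512, 512, 256].
    blocks = []
    while size > 0:
        blocks.append(min(size, max_block_size))
        size -= blocks[-1]
    return blocks

assert split_into_blocks(1280) == [512, 512, 256]
```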
Daniel Povey
41df045773
Simplify formula, getting rid of scalar_exp_avg_sq
2022-07-11 17:14:12 -07:00
Daniel Povey
4f0e219523
Bug fix to reproduce past results with max_block_size unset.
2022-07-11 17:03:32 -07:00
Daniel Povey
075a2e27d8
Replace max_fullcov_size with max_block_size
2022-07-11 16:37:01 -07:00
Daniel Povey
3468c3aa5a
Remove ActivationBalancer, unnecessary
2022-07-11 14:12:24 -07:00
Daniel Povey
7993c84cd6
Apparently working version, with changed test-code topology
2022-07-11 13:17:29 -07:00
Zengwei Yao
ce26495238
Rand combine update result (#467)
* update RESULTS.md
* fix test code in pruned_transducer_stateless5/conformer.py
* minor fix
* delete doc
* fix style
2022-07-11 18:13:31 +08:00
Daniel Povey
245d39b1bb
Still debugging but close to done
2022-07-11 00:33:37 -07:00
Daniel Povey
27da50a1f6
Committing partial work.
2022-07-10 15:46:32 -07:00
Daniel Povey
d25df4af5e
Slight refactoring, preparing for batching.
2022-07-09 22:24:36 -07:00
Daniel Povey
d9a6180ae0
Bug fix
2022-07-10 10:20:39 +08:00
Daniel Povey
b7035844a2
Introduce scalar_max; stop eps from getting too large or too small
2022-07-10 10:13:55 +08:00
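A minimal sketch of what the scalar_max cap might look like (the name scalar_max comes from the commit; the bound value and the log-space storage of eps are assumptions):

```python
import torch

scalar_max = 2.0  # illustrative bound; the commit does not state one

def clamp_log_eps_(log_eps: torch.Tensor) -> None:
    # If eps is kept in log space, clamping log_eps to
    # [-scalar_max, scalar_max] stops eps itself from growing
    # beyond e**scalar_max or shrinking below e**-scalar_max.
    log_eps.clamp_(-scalar_max, scalar_max)
```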
Daniel Povey
2f73434541
Reduce debug frequency
2022-07-10 06:44:50 +08:00
Daniel Povey
b3bb2dac6f
Iterative, more principled way of estimating param_cov
2022-07-10 06:28:01 +08:00
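One common iterative estimator is an exponential moving average of outer products, so earlier estimates decay geometrically instead of being recomputed from scratch; a hedged sketch (the names and the decay constant are assumptions, not the optimizer's actual code):

```python
import torch

def update_param_cov(param_cov: torch.Tensor,
                     p: torch.Tensor,
                     beta: float = 0.9) -> torch.Tensor:
    # EMA estimate of the covariance of a 2-D parameter's rows.
    x = p.reshape(p.shape[0], -1)
    cov = (x @ x.t()) / x.shape[1]
    return beta * param_cov + (1.0 - beta) * cov
```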
Daniel Povey
d139c18f22
Max eig of Q limited to 5 times the mean
2022-07-09 14:30:03 +08:00
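For a symmetric positive semi-definite matrix, capping the top eigenvalue at a multiple of the mean eigenvalue can be sketched like this (a generic illustration, not the commit's code):

```python
import torch

def limit_max_eig(Q: torch.Tensor, ratio: float = 5.0) -> torch.Tensor:
    # Q is assumed symmetric PSD: eigendecompose, clamp the
    # eigenvalues to at most `ratio` times their mean, rebuild.
    eigs, U = torch.linalg.eigh(Q)
    eigs = eigs.clamp(max=(ratio * eigs.mean()).item())
    return U @ torch.diag(eigs) @ U.t()
```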
Daniel Povey
ffeef4ede4
Remove rank-1 dims (i.e., where size == numel()) from processing.
2022-07-09 13:36:48 +08:00
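A dim whose size equals the tensor's numel() means every other dim is 1, so the tensor is effectively a vector along that axis; finding such dims is a one-liner (standalone illustration, with an assumed helper name):

```python
import torch

def rank1_dims(p: torch.Tensor):
    # dims i with p.shape[i] == p.numel() imply all other dims
    # are 1; such axes carry no useful covariance structure.
    return [i for i, s in enumerate(p.shape) if s == p.numel()]

assert rank1_dims(torch.zeros(1, 256, 1)) == [1]
assert rank1_dims(torch.zeros(4, 256)) == []
```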
Daniel Povey
2fc9eb9789
Respect param_pow
2022-07-09 12:49:04 +08:00
Daniel Povey
209acaf6e4
Increase lr_update_period to 200. The update takes about 2 minutes for the entire model.
2022-07-09 11:36:54 +08:00
Daniel Povey
61cab3ab65
introduce grad_cov_period
2022-07-09 10:29:23 +08:00
Daniel Povey
35a51bc153
Reduce debug probs
2022-07-09 10:22:19 +08:00
Daniel Povey
65bc964854
Fix bug for scalar update
2022-07-09 10:14:20 +08:00
Daniel Povey
aa2237a793
Bug fix
2022-07-09 10:11:54 +08:00
Daniel Povey
50ee414486
Fix train.py for new optimizer
2022-07-09 10:09:53 +08:00
Daniel Povey
6810849058
Implement new version of learning method. Does more complete diagonalization of grads than the previous methods.
2022-07-09 10:02:17 +08:00
Daniel Povey
a9edecd32c
Confirmed that symmetrizing helps because of interaction with the regular update; still meta_lr_scale=0 best :-(
2022-07-09 05:20:04 +08:00
Fangjun Kuang
6c69c4e253
Support running icefall outside of a git-tracked directory. (#470)
* Support running icefall outside of a git-tracked directory.
* Minor fixes.
2022-07-08 15:03:07 +08:00
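Detecting whether code is running inside a git work tree is commonly done with git rev-parse; a hedged sketch of that pattern (not necessarily the exact check icefall uses):

```python
import subprocess

def in_git_work_tree() -> bool:
    # "git rev-parse --is-inside-work-tree" prints "true" inside a
    # tracked directory and exits non-zero outside one (or when git
    # is not installed), so both cases fall through to False.
    try:
        out = subprocess.run(
            ["git", "rev-parse", "--is-inside-work-tree"],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.strip() == "true"
    except (subprocess.CalledProcessError, FileNotFoundError):
        return False
```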
Daniel Povey
52bfb2b018
This works better for reasons I don't understand; transposing is enough, same as symmetrizing.
2022-07-08 11:53:59 +08:00
Daniel Povey
e9ab1ddd39
Inconsequential config change
2022-07-08 11:03:16 +08:00
Daniel Povey
be6680e3ba
A couple of configuration changes; comment simplification
2022-07-08 09:46:42 +08:00
Fangjun Kuang
e5fdbcd480
Revert changes to setup_logger. (#468)
2022-07-08 09:15:37 +08:00
Daniel Povey
75e872ea57
Fix bug in getting denom in proj update
2022-07-08 09:13:54 +08:00
Daniel Povey
914ac1e621
Works better with meta_lr_scale=0; must be a bug.
2022-07-08 09:07:06 +08:00
Daniel Povey
923468b8af
Deal with SVD failure better.
2022-07-08 09:00:12 +08:00
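SVD can fail to converge on ill-conditioned inputs; one common mitigation is to add a little noise and retry (a sketch of the general pattern only; the commit's actual handling may differ):

```python
import torch

def robust_svd(M: torch.Tensor):
    # torch.linalg.svd can raise RuntimeError when the algorithm
    # fails to converge; a tiny perturbation usually lets a retry
    # succeed at negligible cost to accuracy.
    try:
        return torch.linalg.svd(M)
    except RuntimeError:
        jitter = 1e-5 * M.abs().mean() * torch.randn_like(M)
        return torch.linalg.svd(M + jitter)
```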
Daniel Povey
97feb8a3ec
Reduce meta_lr_scale; reduces loss @140 from 1.4 to 0.39
2022-07-08 06:33:07 +08:00
Daniel Povey
b6199a71e9
Introduce delta_scale to slow down changes to M; significantly better.
2022-07-08 06:05:31 +08:00
Daniel Povey
ceb9815f2b
Increase lr_est_period
2022-07-08 05:51:18 +08:00
Daniel Povey
fb36712e6b
Another bug fix, regarding Q being transposed.
2022-07-08 05:22:24 +08:00
Daniel Povey
ad2e698fc3
Cleanups
2022-07-08 04:44:21 +08:00
Daniel Povey
04d2e10b4f
Version that runs
2022-07-08 04:37:46 +08:00