Fangjun Kuang
ec69967584
Set overwrite=True when extracting features in batches. ( #487 )
2022-07-29 11:17:19 +08:00
Daniel Povey
9d7af4be20
Modify scaling.py to prevent constant values
2022-07-29 09:34:13 +08:00
Daniel Povey
3c1fddaf48
Rework computation to reduce numerical roundoff
2022-07-29 06:22:17 +08:00
Mingshuang Luo
389f9c77e5
correction for prepare.sh ( #506 )
2022-07-28 17:01:46 +08:00
boji123
3c9e7f733b
[debug] raise a reminder when git-lfs is not available ( #504 )
...
* [debug] raise a reminder when git-lfs is not available
* modify comment
2022-07-28 16:17:49 +08:00
Daniel Povey
633cbd551a
Increase lr_update_period from (200, 4000) to (400, 5000)
2022-07-28 14:45:45 +08:00
Mingshuang Luo
f26b62ac00
[WIP] Pruned-transducer-stateless5-for-WenetSpeech (offline and streaming) ( #447 )
...
* pruned-rnnt5-for-wenetspeech
* style check
* style check
* add streaming conformer
* add streaming decode
* change code for fast_beam_search and export cpu jit
* add modified-beam-search for streaming decoding
* add modified-beam-search for streaming decoding
* change for streaming_beam_search.py
* add README.md and RESULTS.md
* change for style_check.yml
* do some changes
* do some changes for export.py
* add some decode commands for usage
* add streaming results on README.md
2022-07-28 12:54:27 +08:00
Daniel Povey
0d038a6ea4
Remove debugging statement
2022-07-28 09:26:11 +08:00
Daniel Povey
8654a7385d
Add denom_rel_eps, and set it to 1e-05
2022-07-28 09:10:20 +08:00
Daniel Povey
dc565f729b
Take into account various outcomes from parameter tuning
2022-07-28 09:06:59 +08:00
Daniel Povey
daa55d5a3c
Patches to make decoding work correctly at utt start, for greedy_search
2022-07-27 09:35:39 +08:00
Fangjun Kuang
385645d533
Fix get_transducer_model() for aishell. ( #497 )
...
PR #495 introduces an error. This commit fixes it.
2022-07-26 15:42:21 +08:00
Daniel Povey
e25ca74955
Use a measure of correlation for eigs that can be negative.
2022-07-26 13:40:57 +08:00
Daniel Povey
b9696878b4
Update diagnostics stats
2022-07-26 12:39:51 +08:00
Fangjun Kuang
d3fc4b031e
Support using aidatatang_200zh optionally in aishell training ( #495 )
...
* Use aidatatang_200zh optionally in aishell training.
2022-07-26 11:25:01 +08:00
Fangjun Kuang
4612b03947
Fix using G before assignment in pruned_transducer_stateless/decode.py ( #494 )
2022-07-26 10:37:02 +08:00
Wei Kang
b1d0956855
Add modified_beam_search for streaming decode ( #489 )
...
* Add modified_beam_search for pruned_transducer_stateless/streaming_decode.py
* refactor
* modified beam search for stateless3,4
* Fix comments
* Add real streaming ci
2022-07-25 16:53:23 +08:00
Zengwei Yao
8203d10be7
Add stats about duration and padding proportion ( #485 )
...
* add stats about duration and padding proportion
* add for utt_duration
* add stats for other recipes
* add stats for other 2 recipes
* modify doc
* minor change
2022-07-25 16:40:43 +08:00
Fangjun Kuang
d99796898c
Update doc to add a link to Nadira Povey's YouTube channel. ( #492 )
...
* Update doc to add a link to Nadira Povey's YouTube channel.
* fix a typo
2022-07-25 12:06:40 +08:00
Daniel Povey
fe595f8772
Improve debugging output.
2022-07-25 09:02:36 +08:00
Daniel Povey
854c2965a9
Fix bug regarding G_prime being zero
2022-07-25 06:57:52 +08:00
Daniel Povey
3acdf3b395
Reworking the computation of Z to be numerically better.
2022-07-25 06:37:26 +08:00
Daniel Povey
5513f7fee5
Initial version of fixing numerical issue, will continue though
2022-07-25 06:27:01 +08:00
Daniel Povey
b0f0c6c4ab
Setting lr_update_period=(200,4k) in train.py
2022-07-25 04:38:12 +08:00
Daniel Povey
06718052ec
Refactoring, putting tunable values in constructor, a little cleanup
2022-07-25 04:31:42 +08:00
Daniel Povey
8efc512823
Remove some debugging code, found the mismatch
2022-07-24 11:52:10 +08:00
Daniel Povey
ba96439c76
Saving version I am trying to debug
2022-07-24 11:00:40 +08:00
Daniel Povey
962e95f119
Using a more flexible test. Moved to simpler update, tuned differently.
2022-07-24 09:20:53 +08:00
Daniel Povey
b8a9485011
Print git version for test output
2022-07-24 06:54:29 +08:00
Daniel Povey
48ac7e0bc3
Add max as well as min to G_prime
2022-07-24 06:50:05 +08:00
Daniel Povey
6290fcb535
Cleanup and refactoring
2022-07-24 05:48:38 +08:00
Daniel Povey
8a9bbb93bc
Cosmetic fixes
2022-07-24 04:45:57 +08:00
Daniel Povey
966ac36cde
Fixes to comments
2022-07-24 04:36:41 +08:00
Daniel Povey
33ffd17515
Some cleanup
2022-07-24 04:22:11 +08:00
Daniel Povey
ddceb7963b
Interpolate between iterative estimate of scale, and original value.
2022-07-23 15:27:48 +08:00
Daniel Povey
2c4bdd0ad0
Add _update_param_scales_simple(), add documentation
2022-07-23 14:49:58 +08:00
Daniel Povey
9730352257
Reduce smoothing constant slightly
2022-07-23 13:12:31 +08:00
Daniel Povey
e1873fc0bb
Tune phase2 again, from 0.005,5.0 to 0.01,40. Epoch 140 is 0.21/0.149
2022-07-23 10:10:01 +08:00
Daniel Povey
0fc58bac56
More tuning, epoch-140 results are 0.23,0.11
2022-07-23 09:52:51 +08:00
Daniel Povey
34a2d331bf
Smooth in opposite orientation to G
2022-07-23 09:38:16 +08:00
Daniel Povey
a972655a70
Tuning.
2022-07-23 09:15:49 +08:00
Daniel Povey
b47433b77a
Fix bug in smooth_cov, for power==1.0
2022-07-23 09:06:03 +08:00
Daniel Povey
cc388675a9
Bug fix RE rankj
2022-07-23 08:24:59 +08:00
Daniel Povey
dee496145d
this version performs way worse but has bugs fixed, can optimize from here.
2022-07-23 08:11:20 +08:00
Daniel Povey
dd10eb140f
First version after refactoring and changing the math, where optim.py runs
2022-07-23 06:32:56 +08:00
Quandwang
116d0cf26d
CTC attention model with reworked Conformer encoder and reworked Transformer decoder ( #462 )
...
* ctc attention model with reworked conformer encoder and reworked transformer decoder
* remove unnecessary func
* resolve flake8 conflicts
* fix typos and modify the expr of ScaledEmbedding
* use original beam size
* minor changes to the scripts
* add rnn lm decoding
* minor changes
* check whether q k v weight is None
* check whether q k v weight is None
* check whether q k v weight is None
* style correction
* update results
* update results
* upload the decoding results of rnn-lm to the RESULTS
* upload the decoding results of rnn-lm to the RESULTS
* Update egs/librispeech/ASR/RESULTS.md
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/librispeech/ASR/RESULTS.md
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update egs/librispeech/ASR/RESULTS.md
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2022-07-22 15:31:25 +08:00
Daniel Povey
4da4e69fba
Draft of new way of smoothing param_rms, diagonalized by grad
2022-07-22 06:37:20 +08:00
Mingshuang Luo
3d2986b4c2
Update conformer.py for aishell4 ( #484 )
...
* update conformer.py for aishell4
* update conformer.py
* add strict=False when model.load_state_dict
2022-07-20 21:32:53 +08:00
Daniel Povey
a8696b36fc
Merge pull request #483 from yaozengwei/fix_diagnostic
...
Fix diagnostic
2022-07-18 23:33:45 -07:00
yaozengwei
a35b28cd8d
fix for case of None stats
2022-07-19 14:29:23 +08:00