Mirror of https://github.com/k2-fsa/icefall.git
Reduce the limit on attention weights from 50 to 25.
This commit is contained in:
parent c5cb52fed1
commit 9f68b5717c
@@ -1116,7 +1116,7 @@ class RelPositionMultiheadAttention(nn.Module):
 # this mechanism instead of, say, a limit on entropy, because once the entropy
 # gets very small gradients through the softmax can become very small, and
 # some mechanisms like that become ineffective.
-attn_weights_limit = 50.0
+attn_weights_limit = 25.0
 # caution: this penalty will be affected by grad-scaling in amp.
 # It's OK; this is just an emergency brake, and under normal
 # conditions it shouldn't be active
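The surrounding comments explain the design choice: the limit is applied directly to the magnitude of the attention weights rather than, say, to their entropy, because once the entropy gets very small, gradients through the softmax also become very small and an entropy-based mechanism stops having any effect. Below is a minimal sketch of such an emergency-brake penalty; the helper name `attention_limit_penalty`, the squared-excess form, and the 1.0e-04 weight are illustrative assumptions, not the actual icefall implementation (which this hunk does not show):

# Sketch only: names and constants below are assumptions for illustration,
# not the code from RelPositionMultiheadAttention.
import torch

def attention_limit_penalty(attn_scores: torch.Tensor, limit: float = 25.0) -> torch.Tensor:
    # Penalize only the part of each attention score whose absolute value
    # exceeds `limit`; below the limit the penalty and its gradient are zero,
    # so the brake is inactive under normal conditions.
    excess = (attn_scores.abs() - limit).clamp(min=0.0)
    return (excess ** 2).mean()

# Usage sketch: fold the penalty into the training loss with a small weight
# so it only matters when scores blow past the limit.
scores = torch.randn(2, 4, 10, 10) * 30.0  # fake attention scores, some beyond the limit
aux_loss = 1.0e-04 * attention_limit_penalty(scores, limit=25.0)

Because such a penalty enters the training loss, it gets multiplied by the AMP grad scaler along with everything else, which is what the "affected by grad-scaling in amp" caution refers to; the commit's comments deem this acceptable since the brake shouldn't be active under normal conditions.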