docs: minor fixes of LM rescoring texts (#1498)
parent b3e2044068
commit e59fa38e86
First, let's have a look at some background information. As the predecessor of LODR, Density Ratio (DR) was first proposed `here <https://arxiv.org/abs/2002.11268>`_
to address the language information mismatch between the training
corpus (source domain) and the testing corpus (target domain). Assuming that the source domain and the test domain
are acoustically similar, DR derives the following formula for decoding with Bayes' theorem:
.. math::

    \text{score}\left(y_u|\mathit{x},y\right) =
        \log p\left(y_u|\mathit{x},y_{1:u-1}\right) +
        \lambda_1 \log p_{\text{Target LM}}\left(y_u|\mathit{x},y_{1:u-1}\right) -
        \lambda_2 \log p_{\text{Source LM}}\left(y_u|\mathit{x},y_{1:u-1}\right)

where :math:`\lambda_1` and :math:`\lambda_2` are the weights of LM scores for target domain and source domain respectively.
Here, the source domain LM is trained on the training corpus. The only difference in the above formula compared to
shallow fusion is the subtraction of the source domain LM.
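To make the formula concrete, here is a minimal sketch of how the three log-scores could be combined for a single candidate token during beam search. The names and example weights are illustrative assumptions, not the actual ``icefall`` implementation:

.. code-block:: python

    import math

    def dr_token_score(
        asr_logprob: float,        # log p(y_u | x, y_{1:u-1}) from the ASR model
        target_lm_logprob: float,  # log p(y_u | y_{1:u-1}) from the target-domain LM
        source_lm_logprob: float,  # log p(y_u | y_{1:u-1}) from the source-domain LM
        lambda1: float = 0.3,      # weight of the target-domain LM (illustrative)
        lambda2: float = 0.1,      # weight of the source-domain LM, subtracted (illustrative)
    ) -> float:
        """Density Ratio score for one candidate token, following the formula above."""
        return asr_logprob + lambda1 * target_lm_logprob - lambda2 * source_lm_logprob

    # Example: the ASR model and the two LMs each assign a probability to the candidate token.
    print(dr_token_score(math.log(0.6), math.log(0.2), math.log(0.4)))

Both weights are normally tuned on a development set from the target domain.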
Some works treat the predictor and the joiner of the neural transducer as its internal LM. However, this LM is
considered to be weak and can only capture low-level language information. Therefore, LODR proposes to use a low-order n-gram LM
as an approximation of the internal LM (ILM) of the neural transducer, which leads to the following formula
during decoding for the transducer model:

.. math::

    \text{score}\left(y_u|\mathit{x},y\right) =
        \log p_{rnnt}\left(y_u|\mathit{x},y_{1:u-1}\right) +
        \lambda_1 \log p_{\text{Target LM}}\left(y_u|\mathit{x},y_{1:u-1}\right) -
        \lambda_2 \log p_{\text{bi-gram}}\left(y_u|\mathit{x},y_{1:u-1}\right)
In LODR, an additional bi-gram LM estimated on the source domain (e.g. the training corpus) is required. Compared to DR,
the only difference lies in the choice of source domain LM. According to the original `paper <https://arxiv.org/abs/2203.16776>`_,
LODR achieves performance similar to DR in both intra-domain and cross-domain settings.
As a bi-gram is much faster to evaluate, LODR is usually much faster.
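As a rough illustration of what "estimating a bi-gram LM on the source domain" means, here is a toy word-level estimator with add-one smoothing. It is only a sketch: real recipes build the n-gram LM with dedicated tooling, typically over subword tokens rather than words.

.. code-block:: python

    import math
    from collections import Counter

    def estimate_bigram(sentences):
        """Return an add-one-smoothed bi-gram log-probability function."""
        unigram, bigram = Counter(), Counter()
        vocab = {"<s>", "</s>"}
        for tokens in sentences:
            vocab.update(tokens)
            padded = ["<s>"] + tokens + ["</s>"]
            unigram.update(padded[:-1])
            bigram.update(zip(padded[:-1], padded[1:]))

        def logprob(prev, cur):
            # Add-one smoothing so unseen bi-grams still receive a small probability.
            return math.log((bigram[(prev, cur)] + 1) / (unigram[prev] + len(vocab)))

        return logprob

    # Toy "source domain" corpus; a real recipe would use the full training transcripts.
    train = [["the", "cat", "sat"], ["the", "dog", "sat"]]
    lm = estimate_bigram(train)
    print(lm("the", "cat"))  # seen bi-gram: relatively high log-probability
    print(lm("cat", "dog"))  # unseen bi-gram: much lower log-probability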
Now, we will show you how to use LODR in ``icefall``.
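Conceptually, such a bi-gram simply takes the place of the source-domain LM in the DR combination sketched earlier. The following sketch uses illustrative names and weights and is not the actual ``icefall`` decoding code, which is driven through the recipes' decoding scripts:

.. code-block:: python

    import math

    def lodr_token_score(
        rnnt_logprob: float,       # log p_rnnt(y_u | x, y_{1:u-1}) from the transducer
        target_lm_logprob: float,  # log p(y_u | y_{1:u-1}) from the target-domain neural LM
        bigram_logprob: float,     # log p(y_u | y_{u-1}) from the source-domain bi-gram
        lambda1: float = 0.4,      # illustrative target-LM weight; tune on a dev set
        lambda2: float = 0.16,     # illustrative bi-gram weight, subtracted as an ILM proxy
    ) -> float:
        """Per-token LODR score: same form as DR, with a bi-gram as the source-domain LM."""
        return rnnt_logprob + lambda1 * target_lm_logprob - lambda2 * bigram_logprob

    print(lodr_token_score(math.log(0.6), math.log(0.2), math.log(0.3)))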
A few parameters can be tuned to further boost the performance of shallow fusion:

- ``--lm-scale``

  Controls the scale of the LM. If too small, the external language model may not be fully utilized; if too large,
  the LM score might dominate during decoding, leading to a bad WER. A typical value is around 0.3 (see the sketch after this list).

- ``--beam-size``
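To make the roles of these two options concrete, here is a toy pruning step (illustrative code, not ``icefall``'s beam search): the LM scale weights the external LM's contribution to each hypothesis score, and the beam size caps how many hypotheses survive each decoding step.

.. code-block:: python

    from dataclasses import dataclass

    @dataclass
    class Hypothesis:
        tokens: list          # tokens decoded so far
        asr_score: float      # accumulated ASR (e.g. transducer) log-probability
        lm_score: float       # accumulated external-LM log-probability

    def prune(hyps, lm_scale=0.3, beam_size=4):
        """Keep the `beam_size` best hypotheses under the shallow-fusion score."""
        return sorted(
            hyps,
            key=lambda h: h.asr_score + lm_scale * h.lm_score,
            reverse=True,
        )[:beam_size]

    # With a larger lm_scale, hypotheses favoured by the LM move up the ranking;
    # with a larger beam_size, more low-scoring hypotheses survive to the next step.
    hyps = [
        Hypothesis(["a"], -1.0, -0.5),
        Hypothesis(["b"], -1.2, -0.1),
        Hypothesis(["c"], -0.9, -2.0),
    ]
    print([h.tokens for h in prune(hyps, lm_scale=0.3, beam_size=2)])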