mirror of https://github.com/k2-fsa/icefall.git (synced 2025-08-26 18:24:18 +00:00)
Add more documentation.
parent d680b56c5c
commit e3d7f21372
@@ -17,23 +17,42 @@
 """
 This file contains rescoring code for NN LMs, e.g., conformer LM.
 
-Here are the ideas about preparing the inputs for the conformer LM model
-from an Nbest object.
+Suppose an utterance has 3 paths:
+    (a, b, c)
+and we want to use a masked conformer LM to assign a likelihood to each path.
 
-Given an Nbest object `nbest`, we have:
-    - nbest.fsa
-    - nbest.shape, whose axes are [utt][path]
+The following shows the steps:
 
-We can get `tokens` from nbest.fsa. The resulting `tokens` will have
-2 axes [path][token]. Note, we should remove 0s from `tokens`.
+(1) Select path pairs:
+    (a, b), (a, c)
+    (b, a), (b, c)
+    (c, a), (c, b)
 
-We can generate the following inputs for the conformer LM model from `tokens`:
-    - masked_src
-    - src
-    - tgt
-by using `k2.levenshtein_alignment`.
+(2) For each pair, e.g., for the pair (a, b),
 
-TODO(fangjun): Add more doc about rescoring with masked conformer-lm.
+    (i) Compute the alignment between "a" and "b"
+    (ii) Use the computed alignment as `masked_src`
+    (iii) Use "a" as "src" and its shifted version as "tgt" (of course, we need
+          to add bos and eos) and we can get a log-likelihood value (after
+          negating the negative log-likelihood). Let us call this value
+          "ab_self"
+    (iv) Use "b" as "src" and its shifted version as "tgt".
+         We can get another log-likelihood value, denoted as "ab_other"
+
+So for the path pairs (a, b), (a, c), ..., (c, b), we can get the following
+log-likelihood values, viewed as two tensors:
+
+    self = [ab_self, ac_self, ba_self, bc_self, ca_self, cb_self]
+
+    other = [ab_other, ac_other, ba_other, bc_other, ca_other, cb_other]
+
+Compute the difference between the two tensors:
+
+    self - other = [ab_self - ab_other, ac_self - ac_other, ...]
+
+The log-likelihood for path a is: max(ab_self - ab_other, ac_self - ac_other)
+The log-likelihood for path b is: max(ba_self - ba_other, bc_self - bc_other)
+The log-likelihood for path c is: max(ca_self - ca_other, cb_self - cb_other)
 """
 
 from typing import Tuple
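The pairwise scheme described in the new docstring can be sketched in plain Python. This is only an illustration under stated assumptions, not the code from this commit: the helper names `select_path_pairs`, `make_src_tgt`, and `rescore_paths` are hypothetical, the bos/eos convention shown is one common choice, and the score values are made up. In a real setup the `self` and `other` log-likelihoods would come from the masked conformer LM.

```python
from itertools import permutations
from typing import Dict, List, Sequence, Tuple


def select_path_pairs(paths: Sequence[str]) -> List[Tuple[str, str]]:
    """Step (1): all ordered pairs (p, q) with p != q, grouped by p."""
    return list(permutations(paths, 2))


def make_src_tgt(
    tokens: List[int], bos_id: int, eos_id: int
) -> Tuple[List[int], List[int]]:
    """Assumed convention: src is bos + tokens, tgt is tokens + eos."""
    return [bos_id] + tokens, tokens + [eos_id]


def rescore_paths(
    paths: Sequence[str],
    self_scores: Sequence[float],
    other_scores: Sequence[float],
) -> Dict[str, float]:
    """Take max(self - other) over all pairs that start with each path."""
    pairs = select_path_pairs(paths)
    assert len(pairs) == len(self_scores) == len(other_scores)
    best: Dict[str, float] = {}
    for (path, _), s, o in zip(pairs, self_scores, other_scores):
        diff = s - o
        best[path] = max(best.get(path, float("-inf")), diff)
    return best


# Made-up log-likelihoods, ordered as [ab, ac, ba, bc, ca, cb].
self_scores = [-10.0, -12.0, -9.0, -11.0, -15.0, -14.0]
other_scores = [-11.0, -12.5, -9.5, -13.0, -14.0, -13.0]

scores = rescore_paths(["a", "b", "c"], self_scores, other_scores)
print(scores)  # {'a': 1.0, 'b': 2.0, 'c': -1.0}
```

With these numbers, path b gets the highest score because its best pairwise margin (bc_self - bc_other = 2.0) is the largest.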