# Pointwise Mutual Information (PMI)
$
\begin{array}{c}
P M I(w, c)=\log \frac{P(w, c)}{P(w) P(c)}=\log \frac{P(w) P(c \mid w)}{P(w) P(c)}=\log \frac{P(c \mid w)}{P(c)} \\
P(c)=\frac{f(c)}{\sum_{k} f\left(c_{k}\right)}, \quad P(c \mid w)=\frac{f(w, c)}{f(w)} \\
P M I(w, c)=\log \frac{f(w, c) \sum_{k} f\left(c_{k}\right)}{f(w) f(c)}
\end{array}
$
$f(w, c):$ frequency of word $w$ in context $c$
$f(w):$ frequency of word $w$ in all contexts
$f(c):$ frequency of context $c$
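As a sketch, the count-based form of PMI above can be computed directly from a co-occurrence table. The words, contexts, and counts below are made up for illustration:

```python
import math
from collections import Counter

# Hypothetical toy co-occurrence counts f(w, c): word -> {context: count}
cooccur = {
    "apple": {"fruit": 4, "computer": 2},
    "pie":   {"fruit": 3, "computer": 1},
}

# f(w): frequency of word w in all contexts
f_w = {w: sum(ctxs.values()) for w, ctxs in cooccur.items()}

# f(c): frequency of context c over all words
f_c = Counter()
for ctxs in cooccur.values():
    f_c.update(ctxs)

# sum_k f(c_k): total count over all contexts
total = sum(f_c.values())

def pmi(w, c):
    """PMI(w, c) = log[ f(w, c) * sum_k f(c_k) / (f(w) * f(c)) ]."""
    return math.log(cooccur[w][c] * total / (f_w[w] * f_c[c]))
```

A positive value means $w$ and $c$ co-occur more often than chance; a negative value means less often.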
PMI is an alternative to [[TF-IDF]] for weighting term–context matrices.
PMI is biased toward infrequent events: very rare words tend to receive very high PMI values. One way to reduce this bias toward low-frequency events is to change the computation of $P(c)$ slightly, using a smoothed estimate $P_{\alpha}(c)$ that raises the count of each context to the power of $\alpha$ (with $0 < \alpha < 1$):
$
P_{\alpha}(c)=\frac{\operatorname{count}(c)^{\alpha}}{\sum_{k} \operatorname{count}\left(c_{k}\right)^{\alpha}}
$
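A minimal sketch of the smoothing, using made-up context counts (the rare context here exists only to show the effect): because $\alpha < 1$ compresses large counts more than small ones, rare contexts get a relatively larger $P_{\alpha}(c)$, which lowers their PMI.

```python
import math

# Hypothetical context counts f(c); "zebra" is a deliberately rare context.
counts = {"fruit": 100, "computer": 50, "zebra": 1}

def p_alpha(c, alpha):
    """P_alpha(c) = count(c)^alpha / sum_k count(c_k)^alpha."""
    denom = sum(v ** alpha for v in counts.values())
    return counts[c] ** alpha / denom

alpha = 0.75  # a value between 0 and 1; smaller alpha smooths more

# Unsmoothed vs. smoothed probability of the rare context:
p_plain = counts["zebra"] / sum(counts.values())
p_smooth = p_alpha("zebra", alpha)
```

Here `p_smooth > p_plain`: the rare context's probability is raised, so the denominator of $\log \frac{P(c \mid w)}{P_{\alpha}(c)}$ grows and its PMI shrinks.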
---
## References