# Pointwise Mutual Information (PMI)

$$
PMI(w, c) = \log \frac{P(w, c)}{P(w) P(c)} = \log \frac{P(w)\, P(c \mid w)}{P(w) P(c)} = \log \frac{P(c \mid w)}{P(c)}
$$

$$
P(c) = \frac{f(c)}{\sum_{k} f(c_{k})}, \quad P(c \mid w) = \frac{f(w, c)}{f(w)}
$$

$$
PMI(w, c) = \log \frac{f(w, c) \sum_{k} f(c_{k})}{f(w)\, f(c)}
$$

- $f(w, c)$: frequency of word $w$ in context $c$
- $f(w)$: frequency of word $w$ in all contexts
- $f(c)$: frequency of context $c$

PMI is an alternative to [[TF-IDF]].

PMI is biased toward infrequent events: very rare words tend to receive very high PMI values. One way to reduce this bias is to change the computation of $P(c)$, using a function $P_{\alpha}(c)$ that raises the context counts to the power of $\alpha$ (a common choice is $\alpha = 0.75$):

$$
P_{\alpha}(c) = \frac{\operatorname{count}(c)^{\alpha}}{\sum_{c'} \operatorname{count}(c')^{\alpha}}
$$

---

## References
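The formulas above can be sketched in code. Below is a minimal NumPy sketch (the function name `pmi_matrix` and its parameters are my own, not from the source): given a word-by-context co-occurrence count matrix, it computes PMI, with optional $\alpha$-smoothing of $P(c)$ and optional clipping of negative values (the common "positive PMI" variant).

```python
import numpy as np

def pmi_matrix(counts, alpha=1.0, positive=False):
    """PMI from a co-occurrence matrix where counts[i, j] = f(w_i, c_j).

    With alpha < 1, context probabilities are smoothed as
    P_alpha(c) = count(c)^alpha / sum_c' count(c')^alpha,
    which dampens the bias toward rare contexts.
    """
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    p_wc = counts / total                                # P(w, c)
    p_w = counts.sum(axis=1, keepdims=True) / total      # P(w)
    c_alpha = counts.sum(axis=0) ** alpha
    p_c = (c_alpha / c_alpha.sum())[np.newaxis, :]       # P_alpha(c)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(p_wc / (p_w * p_c))                 # log P(w,c) / (P(w) P_alpha(c))
    # Zero co-occurrence counts give log(0) = -inf; map them to 0 when
    # clipping (PPMI), otherwise leave them as -inf.
    if positive:
        pmi[~np.isfinite(pmi)] = 0.0
        pmi = np.maximum(pmi, 0.0)
    return pmi
```

For example, with `counts = [[10, 0], [5, 5]]` and `alpha=1.0`, the entry for the first word-context pair is $\log\frac{0.5}{0.5 \times 0.75} = \log\frac{4}{3}$, matching the count form of the formula: $\log\frac{10 \cdot 20}{10 \cdot 15}$.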