ListNet and ListMLE - Notes on AI

# ListNet and ListMLE Create a probabilistic model for ranking, which is differentiable. Sample documents from a Plackett-Luce distribution: $ P\left(d_{i}\right)=\frac{\phi\left(s_{i}\right)}{\sum_{d_{j} \in D} \phi\left(s_{j}\right)} $ For instance, $\phi\left(s_{i}\right)=e^{s_{i}}$ $ P\left(d_{i}\right)=\frac{e^{s_{i}}}{\sum_{d_{j} \in D} e^{s_{j}}} $ According to the Luce model [Luce, 2005], given four items $\left\{d_{1}, d_{2}, d_{3}, d_{4}\right\}$ the probability of observing a particular rank-order, say $\left[d_{2}, d_{1}, d_{4}, d_{3}\right],$ is given by: $ P(\pi \mid s)=\frac{\phi\left(s_{2}\right)}{\phi\left(s_{1}\right)+\phi\left(s_{2}\right)+\phi\left(s_{3}\right)+\phi\left(s_{4}\right)} \cdot \frac{\phi\left(s_{1}\right)}{\phi\left(s_{1}\right)+\phi\left(s_{3}\right)+\phi\left(s_{4}\right)} \cdot \frac{\phi\left(s_{4}\right)}{\phi\left(s_{3}\right)+\phi\left(s_{4}\right)} $ where, $\pi$ is a particular permutation and $\phi$ is a transformation (e.g., linear, exponential, or sigmoid) over the score $s_{i}$ corresponding to item $d_{i}$. ListNet [Cao et al., 2007] - Compute the probability distribution over all possible permutations based on model score and ground-truth labels. The loss is then given by the K-L divergence between these two distributions. - This is computationally very costly, computing permutations of only the top-K items makes it slightly less prohibitive. ListMLE [Xia et al., 2008] - Compute the probability of the ideal permutation based on the ground truth. However, with categorical labels more than one permutation is possible which makes this difficult. --- ## References