# Polyloss

PolyLoss provides a framework for understanding and improving the commonly used [[Cross entropy]] loss and [[Focal Loss]]. It is inspired by the Taylor expansions of both losses in powers of $(1-P_t)$, where $P_t$ is the predicted probability of the target class:

$$
\begin{gathered}
L_{\mathrm{CE}}=-\log \left(P_t\right)=\sum_{j=1}^{\infty} \frac{1}{j}\left(1-P_t\right)^j=\left(1-P_t\right)+\frac{1}{2}\left(1-P_t\right)^2+\ldots \\
L_{\mathrm{FL}}=-\left(1-P_t\right)^\gamma \log \left(P_t\right)=\sum_{j=1}^{\infty} \frac{1}{j}\left(1-P_t\right)^{j+\gamma}=\left(1-P_t\right)^{1+\gamma}+\frac{1}{2}\left(1-P_t\right)^{2+\gamma}+\ldots
\end{gathered}
$$

The authors find that tuning the coefficient of the first polynomial term yields the most significant gain, which gives the Poly-1 loss:

$$
L_{\text{Poly-1}}=\left(1+\epsilon_1\right)\left(1-P_t\right)+\frac{1}{2}\left(1-P_t\right)^2+\ldots=-\log \left(P_t\right)+\epsilon_1\left(1-P_t\right)
$$

Experimental observations:

- Poly-1 loss improves 2D image classification on ImageNet.
- Poly-1 loss improves 2D instance segmentation and object detection on COCO.
- Tuning $\epsilon_1$ changes prediction confidence: a positive $\epsilon_1$ increases the confidence $P_t$, while a negative $\epsilon_1$ lowers overconfident predictions.

---

## References

1. Leng et al., "PolyLoss: A Polynomial Expansion Perspective of Classification Loss Functions", 2022. https://arxiv.org/abs/2204.12511
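
## Poly-1 sketch

The closed form of Poly-1 in this note, $-\log(P_t)+\epsilon_1(1-P_t)$, is short enough to check by hand. Below is a minimal pure-Python sketch for a single example, not the authors' implementation. The function names (`softmax`, `poly1_cross_entropy`, `ce_taylor`) and the default `epsilon=1.0` are assumptions made for illustration; in practice $\epsilon_1$ is a hyperparameter to tune per task.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def poly1_cross_entropy(logits, target, epsilon=1.0):
    """Poly-1 loss for one example: -log(P_t) + epsilon * (1 - P_t).

    `epsilon` plays the role of epsilon_1; the default here is only
    an illustrative value, not a recommended setting.
    """
    p_t = softmax(logits)[target]
    cross_entropy = -math.log(p_t)
    return cross_entropy + epsilon * (1.0 - p_t)


def ce_taylor(p_t, n_terms):
    """First n_terms of the expansion sum_j (1 - P_t)^j / j of -log(P_t)."""
    return sum((1.0 - p_t) ** j / j for j in range(1, n_terms + 1))


if __name__ == "__main__":
    logits = [2.0, 0.5, -1.0]  # hypothetical 3-class logits
    print("cross-entropy:", poly1_cross_entropy(logits, 0, epsilon=0.0))
    print("Poly-1       :", poly1_cross_entropy(logits, 0, epsilon=1.0))
    print("Taylor, 60 terms at P_t = 0.8:", ce_taylor(0.8, 60))
    print("-log(0.8)                    :", -math.log(0.8))
```

Setting `epsilon=0.0` recovers plain cross-entropy, and `ce_taylor` approaches $-\log(P_t)$ as more terms are added, which is the expansion the Poly-1 loss perturbs.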