# PolyLoss
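PolyLoss provides a framework for understanding and improving the commonly used [[Cross entropy]] loss and focal loss. It is inspired by the Taylor expansion of both losses in powers of $(1-P_t)$, where $P_t$ is the model's predicted probability for the target class and $\gamma$ is the focusing parameter of focal loss: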
$
\begin{gathered}
L_{\mathrm{CE}}=-\log \left(P_t\right)=\sum_{j=1}^{\infty} \frac{1}{j}\left(1-P_t\right)^j=\left(1-P_t\right)+\frac{1}{2}\left(1-P_t\right)^2+\ldots \\
L_{\mathrm{FL}}=-\left(1-P_t\right)^\gamma \log \left(P_t\right)=\sum_{j=1}^{\infty} \frac{1}{j}\left(1-P_t\right)^{j+\gamma}=\left(1-P_t\right)^{1+\gamma}+\frac{1}{2}\left(1-P_t\right)^{2+\gamma}+\ldots
\end{gathered}
$
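The expansions can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names `ce_taylor` and `fl_taylor` and the sample values of $P_t$ and $\gamma$ are illustrative assumptions, not taken from the paper. It compares truncated sums against the closed-form losses.

```python
import math


def ce_taylor(p_t: float, n_terms: int) -> float:
    """Partial sum of the cross-entropy expansion: sum_{j=1}^{n} (1 - p_t)^j / j."""
    return sum((1.0 - p_t) ** j / j for j in range(1, n_terms + 1))


def fl_taylor(p_t: float, gamma: float, n_terms: int) -> float:
    """Partial sum of the focal loss expansion: sum_{j=1}^{n} (1 - p_t)^(j + gamma) / j."""
    return sum((1.0 - p_t) ** (j + gamma) / j for j in range(1, n_terms + 1))


# Illustrative values (assumptions): target-class probability and focusing parameter.
p_t = 0.7
gamma = 2.0

exact_ce = -math.log(p_t)
exact_fl = -((1.0 - p_t) ** gamma) * math.log(p_t)

# The truncation error shrinks as more polynomial terms are included.
for n in (1, 2, 5, 50):
    ce_err = abs(ce_taylor(p_t, n) - exact_ce)
    fl_err = abs(fl_taylor(p_t, gamma, n) - exact_fl)
    print(f"terms={n:2d}  CE error={ce_err:.3e}  FL error={fl_err:.3e}")
```

The series converges quickly when $P_t$ is close to 1 and slowly when $P_t$ is close to 0, so the leading terms dominate mainly for confident predictions.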
Leng et al. find that tuning the coefficient of the first polynomial term yields the most significant gain, which leads to the Poly-1 loss:
$
L_{\text{Poly-1}}=\left(1+\epsilon_1\right)\left(1-P_t\right)+\frac{1}{2}\left(1-P_t\right)^2+\ldots=-\log \left(P_t\right)+\epsilon_1\left(1-P_t\right)
$
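Because the Poly-1 loss is just cross-entropy plus one extra term, it takes very little code. The sketch below is a standard-library Python illustration for a single example, not the authors' implementation; the helper names `softmax`, `cross_entropy` and `poly1_loss`, the sample logits, and the default `epsilon=1.0` are assumptions made for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Standard cross-entropy for one example: -log(P_t)."""
    p_t = softmax(logits)[target]
    return -math.log(p_t)


def poly1_loss(logits, target, epsilon=1.0):
    """Poly-1 loss for one example: -log(P_t) + epsilon * (1 - P_t).

    `epsilon` plays the role of epsilon_1; the default of 1.0 is an
    illustrative assumption, not a recommended setting.
    """
    p_t = softmax(logits)[target]
    return -math.log(p_t) + epsilon * (1.0 - p_t)


# Hypothetical logits for a 3-class problem, with class 0 as the target.
logits = [2.0, 0.5, -1.0]
target = 0

print("cross-entropy      :", cross_entropy(logits, target))
print("Poly-1, epsilon=+1 :", poly1_loss(logits, target, epsilon=1.0))
print("Poly-1, epsilon=-1 :", poly1_loss(logits, target, epsilon=-1.0))
```

Setting `epsilon=0.0` recovers plain cross-entropy, so $\epsilon_1$ can be treated as one additional hyperparameter on top of an existing cross-entropy training setup.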
Experimental observations:
- Poly-1 loss improves 2D image classification on ImageNet
- Poly-1 loss improves 2D instance segmentation and object detection on COCO
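- Depending on the task and the sign of $\epsilon_1$, Poly-1 loss either increases prediction confidence (positive $\epsilon_1$) or reduces overconfident predictions (negative $\epsilon_1$)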
---
## References
1. Leng et al., "PolyLoss: A Polynomial Expansion Perspective of Classification Loss Functions", 2022. https://arxiv.org/abs/2204.12511