# Visual Tracking

## Template based tracking

Tracking consists of searching for the target object in each frame by comparing candidate regions with a _template_ image. We assume that the template is fixed and given in advance.

![[template-tracking.jpg]]

The template is mapped into a candidate target region in the image using a transformation of coordinates, $\varphi(\mathbf{x})$. This transformation depends on a parameter vector $\mathbf{y}$, and different candidate regions correspond to different values of $\mathbf{y}$, so we write $\varphi(\mathbf{x} ; \mathbf{y})$.

To locate the target, align the template with every possible candidate region in the image and pick the most similar candidate according to a similarity measure. The similarity measure can be based on:

- pixelwise intensity (color) difference:
  - SSD (sum of squared differences): $D(\mathbf{y})=\sum_{\mathbf{x} \in \Omega}[I(\mathbf{x}+\mathbf{y})-T(\mathbf{x})]^{2} \rightarrow \min _{\mathbf{y}}$
  - correlation tracker: $C(\mathbf{y})=\sum_{\mathbf{x} \in \Omega} I(\mathbf{x}+\mathbf{y}) T(\mathbf{x}) \rightarrow \max _{\mathbf{y}}$
- histogram difference: mean-shift tracker.

Both formulas are written for the pure-translation case $\varphi(\mathbf{x} ; \mathbf{y})=\mathbf{x}+\mathbf{y}$, where $I$ is the current frame, $T$ is the template, and $\Omega$ is the set of template pixel coordinates.

We search for the target only in an area around its previous position, exploiting the general knowledge that the object will not have moved far between frames.

Strengths: robustness and simplicity of implementation.

Weaknesses:

- Computation can be time-consuming when the search window is large.
- Only suitable for translation.

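The SSD search over a local window can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: it assumes grayscale frames stored as 2-D arrays, treats $\mathbf{y}$ as the top-left (row, col) offset of the candidate region, and the names `ssd_track`, `prev_pos`, and `radius` are hypothetical and chosen only for this example.

```python
import numpy as np

def ssd_track(image, template, prev_pos, radius):
    """Find the offset y = (row, col) that minimises the SSD D(y),
    searching only within `radius` pixels of the previous position."""
    th, tw = template.shape
    ih, iw = image.shape
    template = template.astype(np.float64)
    r0, c0 = prev_pos
    best_pos, best_ssd = None, np.inf
    # Clamp the search window so every candidate patch lies inside the image.
    for r in range(max(0, r0 - radius), min(ih - th, r0 + radius) + 1):
        for c in range(max(0, c0 - radius), min(iw - tw, c0 + radius) + 1):
            patch = image[r:r + th, c:c + tw].astype(np.float64)
            d = np.sum((patch - template) ** 2)  # D(y) for this candidate
            if d < best_ssd:
                best_pos, best_ssd = (r, c), d
    return best_pos, best_ssd

# Toy usage: cut the template out of a random frame at (22, 35),
# then search around a slightly wrong previous position.
rng = np.random.default_rng(0)
frame = rng.random((60, 80))
template = frame[22:32, 35:47].copy()
pos, score = ssd_track(frame, template, prev_pos=(20, 33), radius=5)
print(pos, score)
```

The cost grows with the square of `radius` times the template area, which is the time-consuming behaviour noted for large search windows. Replacing the squared difference with `np.sum(patch * template)` and taking the maximum instead of the minimum would give the correlation tracker variant.
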
## Mean-shift Tracking

The target and each candidate region are described by histograms with $m$ bins:

- Target model: $\vec{q}=\left(q_{1}, \ldots, q_{m}\right)$
- Target candidate: $\vec{p}(y)=\left(p_{1}(y), \ldots, p_{m}(y)\right)$
- Similarity function: $f(y)=f[\vec{p}(y), \vec{q}]=?$

### The Bhattacharyya Coefficient

Take the element-wise square roots of both histograms:

$\vec{q}^{\prime}=\left(\sqrt{q_{1}}, \ldots, \sqrt{q_{m}}\right)$

$\vec{p}^{\prime}(y)=\left(\sqrt{p_{1}(y)}, \ldots, \sqrt{p_{m}(y)}\right)$

The similarity is the cosine of the angle $\theta_{y}$ between these two vectors. Assuming the histograms are normalized to sum to 1, both square-root vectors have unit norm, so the denominator equals 1 and the cosine reduces to a plain sum:

$f(y)=\cos \theta_{y}=\frac{\vec{p}^{\prime}(y)^{T} \vec{q}^{\prime}}{\left\|\vec{p}^{\prime}(y)\right\| \cdot\left\|\vec{q}^{\prime}\right\|}=\sum_{u=1}^{m} \sqrt{p_{u}(y) q_{u}}$

![[meanshift-traking.jpg]]

## Tracking by Detection

Recent tracking work:

- Focuses on the appearance model
- Borrows techniques from object detection
- Slides a discriminative classifier around the image
- Uses an adaptive appearance model

(Boris Babenko, Ming-Hsuan Yang, Serge Belongie, CVPR09)

The first frame is labeled, and a classifier is then used to classify the object in the following frames.

---

## References