# Image Feature Representations

## Pixel representations

![[pixel-representation.jpg]]

Issues
- Similarity: Distances between raw pixel values do not reflect semantic similarity.
- Translation: Raw pixel values are not translation invariant; shifting the image changes the value at every position.

## Global representations

![[imagehist.jpg]]

Issues
- Throws away all local (spatial) information.
- Useful only for simple, high-level tasks.

## Local features

![[localfeatures.jpg]]

- [[Harris Corner Detection]] / [[Scale-Invariant Feature Transform (SIFT)]]
- [[HoG Feature Descriptor]]
- [[Convolutional Neural Networks (CNN)]] learn successive levels of local features

Desired properties
- Scale invariance
- Rotation invariance
    - [[Group Equivariant Convolutional Neural Networks]]
- Translation invariance
    - [[Convolutional Neural Networks (CNN)]]
- Robustness to lighting changes
- Robustness to projection (viewpoint) changes

### Bag of visual words

Local features can be used to construct a "bag of visual words" for downstream tasks.

![[visual-bow.jpg]]

Steps:
1. Patch sampling
    - [[Harris Corner Detection]] / [[Scale-Invariant Feature Transform (SIFT)]]
2. Visual dictionary ([[Clustering]])
3. Histogram creation
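
The three bag-of-visual-words steps (patch sampling, visual dictionary, histogram creation) can be sketched in plain Python. This is a minimal illustration under loud assumptions, not a reference implementation: patches are sampled on a regular grid rather than at Harris/SIFT keypoints, raw patch pixels stand in for real descriptors, the dictionary is built with a basic k-means loop, and all function names (`sample_patches`, `build_dictionary`, `bovw_histogram`) are hypothetical.

```python
import random

def sample_patches(image, size=2, stride=2):
    """Step 1 (simplified): flatten size x size patches taken on a regular grid.
    Assumption: a grid replaces keypoint detectors such as Harris or SIFT."""
    h, w = len(image), len(image[0])
    return [
        [float(image[r + i][c + j]) for i in range(size) for j in range(size)]
        for r in range(0, h - size + 1, stride)
        for c in range(0, w - size + 1, stride)
    ]

def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, centers):
    """Index of the visual word (cluster center) closest to vec."""
    return min(range(len(centers)), key=lambda k: sq_dist(vec, centers[k]))

def build_dictionary(descriptors, k, iters=10, seed=0):
    """Step 2: basic k-means; the resulting centers are the visual words."""
    rng = random.Random(seed)
    centers = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest(d, centers)].append(d)
        for i, g in enumerate(groups):
            if g:  # an empty cluster keeps its previous center
                centers[i] = [sum(col) / len(g) for col in zip(*g)]
    return centers

def bovw_histogram(descriptors, centers):
    """Step 3: normalized count of how often each visual word occurs."""
    counts = [0] * len(centers)
    for d in descriptors:
        counts[nearest(d, centers)] += 1
    total = sum(counts)
    return [c / total for c in counts]

# Toy images: img_b is img_a with its left and right halves swapped,
# img_c is uniformly bright.
img_a = [[0, 0, 9, 9]] * 4
img_b = [[9, 9, 0, 0]] * 4
img_c = [[9, 9, 9, 9]] * 4

patches_a = sample_patches(img_a)
patches_b = sample_patches(img_b)
patches_c = sample_patches(img_c)

words = build_dictionary(patches_a + patches_b, k=2)
hist_a = bovw_histogram(patches_a, words)
hist_b = bovw_histogram(patches_b, words)
hist_c = bovw_histogram(patches_c, words)
```

In this toy setup `hist_a` and `hist_b` are identical even though the raw pixels of the two images differ at every position, because the histogram only counts which visual words occur, not where. The same property is also the limitation: the spatial layout of the patches is discarded.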