# Image Feature Representations
## Pixel representations
![[pixel-representation.jpg]]
Issues
- Similarity: Distances between raw pixel values do not reflect semantic similarity
- Translation: Pixel values are not translation invariant; shifting the image by even one pixel changes the representation
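Both issues can be seen on a toy example. The sketch below is a minimal illustration in plain Python, assuming grayscale images stored as nested lists; the helper names `shift_right` and `l2_distance` are hypothetical and chosen only for this note. A striped image shifted by one pixel ends up further from the original than a completely blank image does.

```python
def shift_right(img, k=1):
    """Shift every row k pixels to the right, wrapping around (circular shift)."""
    return [row[-k:] + row[:-k] for row in img]


def l2_distance(a, b):
    """Euclidean distance between two equally sized grayscale images."""
    return sum((pa - pb) ** 2
               for ra, rb in zip(a, b)
               for pa, pb in zip(ra, rb)) ** 0.5


# Toy 4x8 image: vertical stripes, one pixel wide (values 255 / 0).
stripes = [[255 if x % 2 == 0 else 0 for x in range(8)] for _ in range(4)]
shifted = shift_right(stripes, 1)       # same pattern, moved by one pixel
blank = [[0] * 8 for _ in range(4)]     # unrelated, empty image

d_shifted = l2_distance(stripes, shifted)
d_blank = l2_distance(stripes, blank)

# The translated copy is "further away" than the blank image.
print(d_shifted > d_blank)
```

The stripe pattern is a deliberately adversarial choice: natural images usually change less under a one-pixel shift, but the distance still grows with the shift rather than with any change in content.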
## Global representations
![[imagehist.jpg]]
Issues
- Throws away all local (spatial) information.
- Useful only for simple, high-level tasks.
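A small sketch of the first issue, assuming the global representation is an intensity histogram (as the figure name suggests). The function `intensity_histogram` and the bin count are hypothetical choices for illustration. Scrambling the pixel positions changes the image but leaves the histogram untouched.

```python
import random
from collections import Counter


def intensity_histogram(img, bins=4, max_value=256):
    """Count pixels per intensity bin; pixel positions are ignored entirely."""
    width = max_value // bins
    counts = Counter(p // width for row in img for p in row)
    return [counts.get(b, 0) for b in range(bins)]


# Toy 4x4 image: bright left half, dark right half.
img = [[200, 200, 20, 20] for _ in range(4)]

# Shuffle all pixels: destroys the spatial layout but keeps the values.
pixels = [p for row in img for p in row]
random.Random(0).shuffle(pixels)
scrambled = [pixels[i:i + 4] for i in range(0, 16, 4)]

hist_original = intensity_histogram(img)
hist_scrambled = intensity_histogram(scrambled)

# Identical histograms for two differently arranged images.
print(hist_original == hist_scrambled)
```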
## Local features
![[localfeatures.jpg]]
- [[Harris Corner Detection]] / [[Scale-Invariant Feature Transform (SIFT)]]
- [[HoG Feature Descriptor]]
- [[Convolutional Neural Networks (CNN)]] learn successive levels of local features
Desired properties
- Scale invariance
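- Rotation invariance - [[Group Equivariant Convolutional Neural Networks]]
- Translation invariance - [[Convolutional Neural Networks (CNN)]] (convolutions are translation equivariant; pooling adds approximate invariance)
- Robustness to lighting changes
- Robustness to projective (viewpoint) changes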
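On the translation property: a convolution by itself is equivariant (shifting the input shifts the feature map), and invariance only appears after a position-independent pooling step. The sketch below illustrates this in 1-D with plain Python. Circular padding is an assumption made here so that the equivariance is exact, and the `[1, -1]` kernel is a hypothetical edge filter.

```python
def correlate_circular(signal, kernel):
    """1-D cross-correlation with wrap-around (circular) padding."""
    n = len(signal)
    return [sum(kernel[j] * signal[(i + j) % n] for j in range(len(kernel)))
            for i in range(n)]


def shift(seq, k):
    """Circularly shift a sequence k positions to the right."""
    k %= len(seq)
    return seq[-k:] + seq[:-k] if k else list(seq)


signal = [0, 0, 1, 3, 1, 0, 0, 0]
kernel = [1, -1]  # hypothetical edge filter

features = correlate_circular(signal, kernel)
features_of_shifted = correlate_circular(shift(signal, 2), kernel)

# Equivariance: shifting the input shifts the feature map by the same amount.
is_equivariant = features_of_shifted == shift(features, 2)

# Invariance: global max pooling removes the dependence on position.
is_invariant_after_pooling = max(features_of_shifted) == max(features)

print(is_equivariant, is_invariant_after_pooling)
```

With zero padding or strided pooling, as used in most real CNNs, both properties hold only approximately.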
### Bag of visual words
Local features can be used to construct a "bag of visual words": a histogram of quantized local descriptors that gives a fixed-length image representation for downstream tasks.
![[visual-bow.jpg]]
Steps:
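1. Patch sampling: detect keypoints and extract local descriptors - [[Harris Corner Detection]] / [[Scale-Invariant Feature Transform (SIFT)]]
2. Visual dictionary: cluster the descriptors ([[Clustering]]); each cluster center is a visual word
3. Histogram creation: assign each descriptor to its nearest visual word and count the occurrences per image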
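The three steps can be sketched end to end in plain Python. This is a toy illustration under loud assumptions, not a reference pipeline: patches are sampled on a dense grid instead of via Harris/SIFT, raw pixel values stand in for descriptors, the clustering is a basic Lloyd's k-means, and all function names and the toy images are hypothetical.

```python
import random


def sample_patches(img, size=2):
    """Step 1 (simplified): dense, non-overlapping size x size patches, flattened."""
    h, w = len(img), len(img[0])
    return [[img[y + dy][x + dx] for dy in range(size) for dx in range(size)]
            for y in range(0, h - size + 1, size)
            for x in range(0, w - size + 1, size)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec, centers):
    """Index of the closest center (visual word) to vec."""
    return min(range(len(centers)), key=lambda i: sq_dist(vec, centers[i]))


def kmeans(vectors, k, iters=10, seed=0):
    """Step 2: plain Lloyd's k-means; the k centers form the visual dictionary."""
    centers = random.Random(seed).sample(vectors, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(v, centers)].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [[sum(col) / len(g) for col in zip(*g)] if g else c
                   for g, c in zip(groups, centers)]
    return centers


def bovw_histogram(img, centers):
    """Step 3: normalized count of how often each visual word occurs in img."""
    patches = sample_patches(img)
    counts = [0] * len(centers)
    for p in patches:
        counts[nearest(p, centers)] += 1
    return [c / len(patches) for c in counts]


# Toy 4x4 grayscale images (hypothetical data).
dark = [[10] * 4 for _ in range(4)]
bright = [[240] * 4 for _ in range(4)]
mixed = [[10, 10, 240, 240] for _ in range(4)]

all_patches = [p for img in (dark, bright, mixed) for p in sample_patches(img)]
dictionary = kmeans(all_patches, k=2)

for name, img in [("dark", dark), ("bright", bright), ("mixed", mixed)]:
    print(name, bovw_histogram(img, dictionary))
```

Each image is reduced to a vector of length `k` regardless of its size, which is what makes the representation convenient for classifiers or retrieval. Which index corresponds to the "dark" or "bright" word depends on the k-means initialization.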
---
## References