# Similarity Measures ## Cosine similarity Cosine similarity is the dot product of two vectors, normalized by the product of the length of the vectors. It calculates the angle between two vectors and is therefore length-independent. Its values ranges from 0 to 1. $ \cos (\theta)=\frac{\sum v 1_{k} * v 2_{k}}{\sqrt{\sum v 1_{k}^{2}} * \sqrt{\sum v 2_{k}^{2}}} $ ## Jaccard Similarity Jaccard similarity measures how similar two sets are, defined as the ratio of their intersection to their union: $ J(A, B)=\frac{|A \cap B|}{|A \cup B|} $ It ranges from 0 (no overlap) to 1 (identical sets). It's commonly used to compare document similarity, recommendation systems, and clustering applications where you need to measure overlap between categorical or binary features. ## KL Divergence Distance - [[Jensen–Shannon Divergence]] ## Optimal Transport Based - [[Wasserstein Distance]] --- ## References