# Latent Semantic Analysis
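Latent Semantic Analysis (LSA) applies [[Singular Value Decomposition (SVD)]] to a term-document matrix to create low-dimensional representations of documents. In the notation below, the rows of the $m \times n$ matrix correspond to terms ($\mathbf{t}_{i}^{T}$) and the columns to documents ($d_{j}$); keeping only the $k$ largest singular values gives the factors $U_{k}$, $\Sigma_{k}$ and the vectors $\mathbf{v}_{1}, \ldots, \mathbf{v}_{k}$.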
$
\begin{array}{c}
\left(\mathbf{t}_{i}^{T}\right) \rightarrow\left[\begin{array}{ccc}
x_{1,1} & \cdots & x_{1, n} \\
\vdots & \ddots & \vdots \\
x_{m, 1} & \ldots & x_{m, n}
\end{array}\right]=\left(\hat{\mathbf{t}}_{i}^{T}\right) \rightarrow\left[\left[\mathbf{u}_{1}\right] \ldots\left[\mathbf{u}_{k}\right]\right] \cdot\left[\begin{array}{ccc}
\sigma_{1} & \ldots & 0 \\
\vdots & \ddots & \vdots \\
0 & \ldots & \sigma_{k}
\end{array}\right] \cdot\left[\begin{array}{c}
{\left[\begin{array}{c}
\mathbf{v}_{1}
\end{array}\right]} \\
\vdots \\
{\left[\begin{array}{c}
\mathbf{v}_{k}
\end{array}\right]}
\end{array}\right] \\
d_{j}=U_{k} \Sigma_{k} \hat{d}_{j} \Longrightarrow \hat{d}_{j}=\Sigma_{k}^{-1} U_{k}^{T} d_{j}
\end{array}
$
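The implication in the last line relies on two assumptions that the formula leaves implicit: the columns of $U_{k}$ are orthonormal ($U_{k}^{T} U_{k}=I_{k}$), and all $k$ retained singular values are positive, so $\Sigma_{k}$ is invertible. Under those assumptions the step can be written out as:
$
\begin{array}{l}
d_{j}=U_{k} \Sigma_{k} \hat{d}_{j} \\
U_{k}^{T} d_{j}=U_{k}^{T} U_{k} \Sigma_{k} \hat{d}_{j}=\Sigma_{k} \hat{d}_{j} \\
\Sigma_{k}^{-1} U_{k}^{T} d_{j}=\hat{d}_{j}
\end{array}
$
For a vector that does not lie in the span of $U_{k}$ (for example, a new document or a query), the first line holds only approximately, and $\Sigma_{k}^{-1} U_{k}^{T} d$ is the least-squares solution of $d \approx U_{k} \Sigma_{k} \hat{d}$.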
Given a collection of documents, perform SVD on the term-document matrix and keep a rank-$k$ approximation to obtain $\Sigma_{k}$ and $U_{k}$.
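As a rough illustration of this step, the sketch below computes a truncated SVD with NumPy. The helper name `truncated_svd`, the toy matrix, the term labels, and the choice `k=2` are all assumptions made for the example and are not part of the note.

```python
import numpy as np

def truncated_svd(X, k):
    """Return U_k (m x k), the k largest singular values, and V_k^T (k x n)."""
    # full_matrices=False requests the thin SVD; NumPy returns the
    # singular values sorted in descending order.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

# Hypothetical term-document matrix: rows are terms, columns are documents.
X = np.array([
    [2.0, 1.0, 0.0, 0.0],   # "ship"
    [1.0, 2.0, 0.0, 0.0],   # "boat"
    [0.0, 0.0, 1.0, 2.0],   # "tree"
    [0.0, 0.0, 2.0, 1.0],   # "wood"
])

U_k, s_k, Vt_k = truncated_svd(X, k=2)

# Rank-k approximation of X, built from the retained factors.
X_k = U_k @ np.diag(s_k) @ Vt_k
print(s_k)
print(np.round(X_k, 3))
```

In this toy matrix the two retained singular values are equal, so the individual columns of `U_k` are not unique, but the rank-2 approximation `X_k` is.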
Given a document and a query, represent them as vectors in the obtained "semantic" vector space:
$
\begin{array}{l}
\hat{d}=\Sigma_{k}^{-1} U_{k}^{T} d \\
\hat{q}=\Sigma_{k}^{-1} U_{k}^{T} q
\end{array}
$
Match the obtained "semantic" vector representations $\hat{d}$ and $\hat{q}$ using cosine similarity, $\cos (\hat{d}, \hat{q})=\frac{\hat{d} \cdot \hat{q}}{\|\hat{d}\|\|\hat{q}\|}$.
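The folding-in and matching steps can be sketched end to end with NumPy. This is a minimal illustration, not a reference implementation: the function names `fit_lsa`, `fold_in` and `cosine`, the toy term-document matrix, the query, and `k=2` are assumptions introduced here.

```python
import numpy as np

def fit_lsa(X, k):
    """Return U_k and the k largest singular values of the term-document matrix X."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k], s[:k]

def fold_in(U_k, s_k, v):
    """Map a term-space vector v to Sigma_k^{-1} U_k^T v."""
    # Dividing elementwise by s_k is the same as multiplying by Sigma_k^{-1}.
    return (U_k.T @ v) / s_k

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Hypothetical term-document matrix: rows are terms, columns are documents.
X = np.array([
    [2.0, 1.0, 0.0, 0.0],   # "ship"
    [1.0, 2.0, 0.0, 0.0],   # "boat"
    [0.0, 0.0, 1.0, 2.0],   # "tree"
    [0.0, 0.0, 2.0, 1.0],   # "wood"
])

U_k, s_k = fit_lsa(X, k=2)

q = np.array([1.0, 0.0, 0.0, 0.0])   # query containing only the term "ship"
q_hat = fold_in(U_k, s_k, q)

# Score every document column against the query in the latent space.
scores = [cosine(fold_in(U_k, s_k, X[:, j]), q_hat) for j in range(X.shape[1])]
print(np.round(scores, 3))
```

With this toy data the first two documents both score approximately 1 and the last two approximately 0. The second document is dominated by "boat" rather than "ship", and its cosine with the query in the raw term space is only about 0.45.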
---
## References