# Compressed Sensing

A compressible signal $\mathbf{x} \in \mathbb{R}^{n}$ may be written as a sparse vector $\mathbf{s} \in \mathbb{R}^{n}$ (containing mostly zeros) in a transform basis $\Psi \in \mathbb{R}^{n \times n}$:

$$
\mathbf{x} = \Psi \mathbf{s}
$$

Despite the considerable success of compression in real-world applications, it still relies on having access to full high-dimensional measurements. Compressed sensing turns the compression paradigm upside down: instead of collecting high-dimensional data just to compress and discard most of the information, it is possible to collect surprisingly few compressed or random measurements and then infer the sparse representation in the transform basis.

If a signal $\mathbf{x}$ is $K$-sparse in $\Psi$, then instead of measuring $\mathbf{x}$ directly ($n$ measurements) and then compressing, it is possible to collect dramatically fewer randomly chosen or compressed measurements and then solve for the nonzero elements of $\mathbf{s}$ in the transformed coordinate system. The measurements $\mathbf{y} \in \mathbb{R}^{p}$, with $K < p \ll n$, are given by

$$
\mathbf{y} = \mathbf{C} \mathbf{x}
$$

The measurement matrix $\mathbf{C} \in \mathbb{R}^{p \times n}$ represents a set of $p$ linear measurements on the state $\mathbf{x}$, and its choice is of critical importance in compressed sensing.

The goal of compressed sensing is to find the sparsest vector $\mathbf{s}$ that is consistent with the measurements $\mathbf{y}$:

$$
\mathbf{y} = \mathbf{C} \Psi \mathbf{s} = \Theta \mathbf{s}
$$

This system of equations is underdetermined, since there are infinitely many consistent solutions $\mathbf{s}$. The sparsest solution $\hat{\mathbf{s}}$ satisfies the optimization problem

$$
\hat{\mathbf{s}} = \underset{\mathbf{s}}{\operatorname{argmin}}\, \|\mathbf{s}\|_{0} \text{ subject to } \mathbf{y} = \mathbf{C} \Psi \mathbf{s}
$$

where $\|\cdot\|_{0}$ denotes the $\ell_{0}$ pseudo-norm, given by the number of nonzero entries; this is also referred to as the cardinality of $\mathbf{s}$. This optimization is non-convex, and in general the solution can only be found with a brute-force search that is combinatorial in $n$ and $K$. Fortunately, under certain conditions on the measurement matrix $\mathbf{C}$, it is possible to relax the problem to a convex $\ell_{1}$ minimization:

$$
\hat{\mathbf{s}} = \underset{\mathbf{s}}{\operatorname{argmin}}\, \|\mathbf{s}\|_{1} \text{ subject to } \mathbf{y} = \mathbf{C} \Psi \mathbf{s}
$$

where $\|\cdot\|_{1}$ is the $\ell_{1}$ norm, given by

$$
\|\mathbf{s}\|_{1} = \sum_{k=1}^{n} |s_{k}|
$$

There are very specific conditions that must be met for the $\ell_{1}$ minimization to converge with high probability to the sparsest solution:

1. The measurement matrix $\mathbf{C}$ must be *incoherent* with respect to the sparsifying basis $\Psi$, meaning that the rows of $\mathbf{C}$ are not correlated with the columns of $\Psi$.
2. The number of measurements $p$ must be sufficiently large, on the order of $p \approx \mathcal{O}(K \log(n/K)) \approx k_{1} K \log(n/K)$, where the constant multiplier $k_{1}$ depends on how incoherent $\mathbf{C}$ and $\Psi$ are.

![[compressed-sensing-figure.jpg]]

---

## References

1. Chapter 3, Data-Driven Science and Engineering, Brunton and Kutz
2. Brunton's video: https://www.youtube.com/watch?v=SbU1pahbbkc
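---

## Code Sketches

As a concrete illustration of the pipeline above, here is a minimal recovery sketch in Python. It assumes a signal that is $K$-sparse in the discrete cosine transform basis $\Psi$ and a Gaussian random measurement matrix $\mathbf{C}$, and it solves the $\ell_{1}$ basis pursuit problem by recasting it as a linear program via the standard split $\mathbf{s} = \mathbf{u} - \mathbf{v}$ with $\mathbf{u}, \mathbf{v} \ge 0$ (a common reformulation, not something prescribed by the text above). The dimensions and variable names are illustrative only.

```python
import numpy as np
from scipy.fft import idct
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, p, K = 256, 64, 5          # ambient dimension, measurements, sparsity

# Build a K-sparse coefficient vector s in the DCT basis Psi.
s_true = np.zeros(n)
s_true[rng.choice(n, size=K, replace=False)] = rng.standard_normal(K)

Psi = idct(np.eye(n), axis=0, norm="ortho")   # columns are inverse-DCT basis vectors
x = Psi @ s_true                              # compressible signal x = Psi s

C = rng.standard_normal((p, n)) / np.sqrt(p)  # random (incoherent) measurement matrix
y = C @ x                                     # p << n compressed measurements

# Basis pursuit: min ||s||_1 subject to y = C Psi s = Theta s, written as a
# linear program with s = u - v, u >= 0, v >= 0, minimizing sum(u) + sum(v).
Theta = C @ Psi
A_eq = np.hstack([Theta, -Theta])
res = linprog(c=np.ones(2 * n), A_eq=A_eq, b_eq=y, bounds=(0, None))
s_hat = res.x[:n] - res.x[n:]

print("max recovery error:", np.max(np.abs(s_hat - s_true)))
```

With $p = 64$ measurements of an $n = 256$ signal, the recovered $\hat{\mathbf{s}}$ should match $\mathbf{s}$ to within solver tolerance, since $p$ comfortably exceeds $K \log(n/K) \approx 20$ here.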
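Condition 1 can be checked numerically. One common measure is the mutual coherence $\mu(\mathbf{C}, \Psi) = \sqrt{n}\, \max_{j,k} |\langle \mathbf{c}_j, \boldsymbol{\psi}_k \rangle|$ between the normalized rows of $\mathbf{C}$ and columns of $\Psi$, which ranges from $1$ (maximally incoherent) to $\sqrt{n}$ (maximally coherent). The sketch below, again with an assumed Gaussian $\mathbf{C}$ and DCT $\Psi$, is only illustrative, since the definition is usually stated for pairs of orthonormal bases.

```python
import numpy as np
from scipy.fft import idct

rng = np.random.default_rng(0)
n, p = 256, 64

Psi = idct(np.eye(n), axis=0, norm="ortho")     # orthonormal DCT basis
C = rng.standard_normal((p, n))
C /= np.linalg.norm(C, axis=1, keepdims=True)   # unit-norm measurement rows

# Largest inner product between any row of C and any column of Psi.
mu = np.sqrt(n) * np.max(np.abs(C @ Psi))
print("mutual coherence:", mu)   # well below sqrt(n) = 16 for random C
```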
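Condition 2 gives a quick back-of-the-envelope estimate of how many measurements are needed. The value $k_{1} = 4$ below is an assumed constant purely for illustration; in practice it depends on the coherence of $\mathbf{C}$ and $\Psi$.

```python
import numpy as np

n, K = 256, 5
k1 = 4                                    # assumed constant, illustration only
p_needed = int(np.ceil(k1 * K * np.log(n / K)))
print(p_needed, "measurements vs", n, "direct samples")   # ~79 vs 256
```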