# Linear Algebra

## Basic Concepts

### Basis vectors

Basis vectors are the reference vectors that define the coordinate directions of a vector space. Any time we describe a vector numerically, the numbers depend on an implicit choice of basis vectors. $\hat{i}$ and $\hat{j}$ are the common (standard) basis vectors of the 2D plane. Formally, a basis is a set of linearly independent (non-collinear) vectors that spans the full space.

### Span of vectors

The span of vectors $\vec{v}$ and $\vec{w}$ is the set of all linear combinations $a\vec{v} + b\vec{w}$, where $a, b \in \mathbb{R}$. The span answers the question: "What are all the possible vectors you can reach with only scalar multiplication and vector addition?" If $\vec{v}$ and $\vec{w}$ "line up", i.e. point along the same direction, their span is just a line. If not, their span is the whole plane. Vectors that "line up" are formally called linearly dependent.

### Linear transformations and matrices

An $n\times n$ matrix is a compact representation of a linear transformation of an $n$-dimensional vector space.

$
\begin{bmatrix}
3 & 2 \\
-2 & 1
\end{bmatrix}
$

Here $[3,-2]^T$ and $[2,1]^T$ represent the new positions of the basis vectors $\hat{i}$ and $\hat{j}$ respectively when the above transformation matrix is applied.

Matrices can be multiplied to compose transformations into a single matrix. The transformation happens from right to left; for example, the combined transformation $\mathbf{A}\mathbf{B}\mathbf{C}$ is equivalent to transforming with $\mathbf{C}$, then $\mathbf{B}$, and then $\mathbf{A}$. This is why the order of matrix multiplication matters. Thus matrices can be thought of as linear operators acting on vectors.

![[transformation-matrices.jpg]]

Note: Affine transformations are combinations of linear transformations and translations. In practice, homogeneous coordinates are also used, which lets 2D affine transformations be written as 3x3 matrices:

![[homogenous-transformations.jpg]]

### Matrix-vector operation properties

$
\begin{aligned}
(\mathbf{A B})^{-1} & =\mathbf{B}^{-1} \mathbf{A}^{-1} \\
(\mathbf{A B C} \ldots)^{-1} & =\ldots \mathbf{C}^{-1} \mathbf{B}^{-1} \mathbf{A}^{-1} \\
\left(\mathbf{A}^T\right)^{-1} & =\left(\mathbf{A}^{-1}\right)^T \\
(\mathbf{A}+\mathbf{B})^T & =\mathbf{A}^T+\mathbf{B}^T \\
(\mathbf{A B})^T & =\mathbf{B}^T \mathbf{A}^T \\
(\mathbf{A B C} \ldots)^T & =\ldots \mathbf{C}^T \mathbf{B}^T \mathbf{A}^T \\
\left(\mathbf{A}^H\right)^{-1} & =\left(\mathbf{A}^{-1}\right)^H \\
(\mathbf{A}+\mathbf{B})^H & =\mathbf{A}^H+\mathbf{B}^H \\
(\mathbf{A B})^H & =\mathbf{B}^H \mathbf{A}^H \\
(\mathbf{A B C} \ldots)^H & =\ldots \mathbf{C}^H \mathbf{B}^H \mathbf{A}^H
\end{aligned}
$

### Inverse of Matrices

The inverse $\mathbf{A}^{-1}$ of a matrix $\mathbf{A} \in \mathbb{C}^{n \times n}$ is defined such that

$
\mathbf{A A}^{-1}=\mathbf{A}^{-1} \mathbf{A}=\mathbf{I},
$

where $\mathbf{I}$ is the $n \times n$ identity matrix. If $\mathbf{A}^{-1}$ exists, $\mathbf{A}$ is said to be nonsingular; otherwise, $\mathbf{A}$ is said to be singular.

Some properties:

$
(\mathbf{A B})^{-1}=\mathbf{B}^{-1} \mathbf{A}^{-1}
$

For any square matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$, $\mathbf{A}$ is invertible if and only if $\operatorname{det}(\mathbf{A}) \neq 0$.

### Rank

When a matrix transforms a space, the number of dimensions of the resulting (output) space is given by the rank of the matrix. Example: if a $3\times 3$ matrix has rank 2, it squishes 3D space down to 2 dimensions (a plane). The set of vectors that land on the origin under the transformation is called the kernel (null space) of the matrix.
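A minimal NumPy sketch of these ideas (the matrices here are made-up examples, not from any particular source): the columns of a matrix are the images of the basis vectors, composition applies from right to left, and the rank measures the dimension of the output space.

```python
import numpy as np

# Columns are the images of the basis vectors i-hat and j-hat.
A = np.array([[3, 2],
              [-2, 1]])

i_hat = np.array([1, 0])
j_hat = np.array([0, 1])
print(A @ i_hat)  # [ 3 -2] -> first column of A
print(A @ j_hat)  # [ 2  1] -> second column of A

# Composition: (A @ B) @ v applies B first, then A.
B = np.array([[0, -1],
              [1,  0]])                # 90-degree rotation
v = np.array([1, 2])
assert np.allclose((A @ B) @ v, A @ (B @ v))

# Order matters: A @ B is generally not B @ A.
print(np.allclose(A @ B, B @ A))       # False

# Rank: a rank-2 3x3 matrix squishes 3D space onto a plane.
M = np.array([[1, 0, 1],
              [0, 1, 1],
              [1, 1, 2]])              # third column = first + second
print(np.linalg.matrix_rank(M))        # 2
```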
### Change of basis

We can change the basis of a space using a change-of-basis matrix. Let's say $\mathbf{A}$ is a matrix whose columns are the new basis vectors (written in the old coordinates). Then the coordinates of a vector in the new basis are given by

$
\mathbf{v}_{new}= \mathbf{A}^{-1}\mathbf{v}
$

where $\mathbf{v}$ is the vector expressed in the old basis. This matrix $\mathbf{A}$ is called the change-of-basis matrix. Let's say we have a matrix $\mathbf{D}$ representing a transformation in the old basis. We can express the same transformation in the new basis as $\mathbf{A}^{-1} \mathbf{D} \mathbf{A}$.

### Eigenvalues and eigenvectors

Most vectors get knocked off their span when undergoing transformations (i.e. when being multiplied by matrices), for example during a rotation. But some vectors may remain on their span (none in a rotation) and only get scaled. These vectors are called the eigenvectors of the matrix. The factor by which an eigenvector gets squished or stretched is called the eigenvalue of that eigenvector.

Let's say we have a matrix $\mathbf{A}$ that represents a linear transformation. Then $\vec{v}$ is an eigenvector and $\lambda$ its eigenvalue when

$
\mathbf{A}\vec{v} = \lambda \vec{v}
$

For the eigenvalues, we have

$
\text{det}(\mathbf{A} - \lambda \mathbf{I}) = 0
$

This expression can be used to solve for the possible eigenvalues of the matrix $\mathbf{A}$.

**Note**: Eigenvalues aren't defined for rectangular matrices, but the singular values are closely related: the right and left singular vectors of a rectangular matrix $\mathbf{M}$ are the eigenvectors of $\mathbf{M}^T\mathbf{M}$ and $\mathbf{M}\mathbf{M}^T$, and the singular values are the square roots of their (shared) nonzero eigenvalues.

For example, let $A=\left[\begin{array}{ll} -5 & 2 \\ -7 & 4 \end{array}\right]$. First we find the eigenvalues of $A$ by solving the equation $\operatorname{det}(\lambda I-A)=0$. This gives:

$
\begin{aligned}
\operatorname{det}\left(\lambda\left[\begin{array}{ll}
1 & 0 \\
0 & 1
\end{array}\right]-\left[\begin{array}{ll}
-5 & 2 \\
-7 & 4
\end{array}\right]\right) & =0 \\
\operatorname{det}\left[\begin{array}{cc}
\lambda+5 & -2 \\
7 & \lambda-4
\end{array}\right] & =0
\end{aligned}
$

Computing the determinant as usual, the result is $\lambda^2+\lambda-6=0$. Solving this equation, we find that $\lambda_1=2$ and $\lambda_2=-3$.

Now we need to find the basic eigenvectors for each $\lambda$. First we will find the eigenvectors for $\lambda_1=2$. We wish to find all vectors $X \neq 0$ such that $A X=2 X$. These are the solutions to $(2 I-A) X=0$.

$
\begin{array}{r}
\left(2\left[\begin{array}{ll}
1 & 0 \\
0 & 1
\end{array}\right]-\left[\begin{array}{ll}
-5 & 2 \\
-7 & 4
\end{array}\right]\right)\left[\begin{array}{l}
x \\
y
\end{array}\right]=\left[\begin{array}{l}
0 \\
0
\end{array}\right] \\
{\left[\begin{array}{ll}
7 & -2 \\
7 & -2
\end{array}\right]\left[\begin{array}{l}
x \\
y
\end{array}\right]=\left[\begin{array}{l}
0 \\
0
\end{array}\right]}
\end{array}
$

This gives the basic eigenvector for $\lambda_1=2$ as $\left[\begin{array}{l}2 \\ 7\end{array}\right]$.
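A quick NumPy check of this worked example (note that `np.linalg.eig` returns unit-length eigenvectors, so the eigenvector comes back as a scalar multiple of $[2, 7]^T$):

```python
import numpy as np

A = np.array([[-5, 2],
              [-7, 4]])

eigvals, eigvecs = np.linalg.eig(A)
print(eigvals)                        # [ 2. -3.] (order may vary)

# The eigenvector for lambda = 2 is proportional to [2, 7].
v = eigvecs[:, np.argmax(eigvals)]
print(v / v[0])                       # [1.  3.5], i.e. a multiple of [2, 7]
assert np.allclose(A @ v, 2 * v)      # A v = lambda v

# Change of basis into the eigenbasis diagonalizes A.
P = eigvecs
print(np.linalg.inv(P) @ A @ P)       # approximately diag(eigvals)
```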
### Trace

The trace is the sum of the diagonal elements of a matrix, which is also equal to the sum of its eigenvalues. Some properties of the trace are:

$
\begin{aligned}
\operatorname{Tr}(\mathbf{A}) & =\sum_i A_{i i} \\
\operatorname{Tr}(\mathbf{A}) & =\sum_i \lambda_i, \quad \lambda_i=\operatorname{eig}(\mathbf{A}) \\
\operatorname{Tr}(\mathbf{A}) & =\operatorname{Tr}\left(\mathbf{A}^T\right) \\
\operatorname{Tr}(\mathbf{A B}) & =\operatorname{Tr}(\mathbf{B A}) \\
\operatorname{Tr}(\mathbf{A}+\mathbf{B}) & =\operatorname{Tr}(\mathbf{A})+\operatorname{Tr}(\mathbf{B}) \\
\operatorname{Tr}(\mathbf{A B C}) & =\operatorname{Tr}(\mathbf{B C A})=\operatorname{Tr}(\mathbf{C A B}) \\
\mathbf{a}^T \mathbf{a} & =\operatorname{Tr}\left(\mathbf{a} \mathbf{a}^T\right)
\end{aligned}
$

### Determinant

When a matrix transforms a space, the factor by which areas (or volumes, in higher dimensions) are scaled is given by its determinant. If the determinant of a matrix is 0, the matrix squishes the space into a smaller dimension than the input dimension. A negative determinant means the orientation of the transformed space is flipped.

The determinant is multiplicative: $\det(\mathbf{U}\mathbf{V})= \det(\mathbf{U})\det(\mathbf{V})$. The determinant of an inverse matrix is the inverse of the determinant of the original matrix, i.e. $\operatorname{det}\left(\mathbf{U}^{-1}\right)=\operatorname{det}(\mathbf{U})^{-1}$.

Some properties of determinants, for an $n \times n$ matrix $\mathbf{A}$:

$
\begin{aligned}
\operatorname{det}(\mathbf{A}) & =\prod_i \lambda_i, \quad \lambda_i=\operatorname{eig}(\mathbf{A}) \\
\operatorname{det}(c \mathbf{A}) & =c^n \operatorname{det}(\mathbf{A}), \quad \text { if } \mathbf{A} \in \mathbb{R}^{n \times n} \\
\operatorname{det}\left(\mathbf{A}^T\right) & =\operatorname{det}(\mathbf{A}) \\
\operatorname{det}(\mathbf{A B}) & =\operatorname{det}(\mathbf{A}) \operatorname{det}(\mathbf{B}) \\
\operatorname{det}\left(\mathbf{A}^{-1}\right) & =1 / \operatorname{det}(\mathbf{A}) \\
\operatorname{det}\left(\mathbf{A}^n\right) & =\operatorname{det}(\mathbf{A})^n \\
\operatorname{det}\left(\mathbf{I}+\mathbf{u} \mathbf{v}^T\right) & =1+\mathbf{u}^T \mathbf{v}
\end{aligned}
$

### The special case of 2x2 matrix

Consider the matrix $\mathbf{A}$

$
\mathbf{A}=\left[\begin{array}{ll}
A_{11} & A_{12} \\
A_{21} & A_{22}
\end{array}\right]
$

Determinant and trace

$
\begin{gathered}
\operatorname{det}(\mathbf{A})=A_{11} A_{22}-A_{12} A_{21} \\
\operatorname{Tr}(\mathbf{A})=A_{11}+A_{22}
\end{gathered}
$

Eigenvalues

$
\lambda^2-\lambda \cdot \operatorname{Tr}(\mathbf{A})+\operatorname{det}(\mathbf{A})=0
$

$
\begin{array}{cl}
\lambda_1=\frac{\operatorname{Tr}(\mathbf{A})+\sqrt{\operatorname{Tr}(\mathbf{A})^2-4 \operatorname{det}(\mathbf{A})}}{2} & \lambda_2=\frac{\operatorname{Tr}(\mathbf{A})-\sqrt{\operatorname{Tr}(\mathbf{A})^2-4 \operatorname{det}(\mathbf{A})}}{2} \\
\lambda_1+\lambda_2=\operatorname{Tr}(\mathbf{A}) & \lambda_1 \lambda_2=\operatorname{det}(\mathbf{A})
\end{array}
$

Eigenvectors

$
\mathbf{v}_1 \propto\left[\begin{array}{c}
A_{12} \\
\lambda_1-A_{11}
\end{array}\right] \quad \mathbf{v}_2 \propto\left[\begin{array}{c}
A_{12} \\
\lambda_2-A_{11}
\end{array}\right]
$

Inverse

$
\mathbf{A}^{-1}=\frac{1}{\operatorname{det}(\mathbf{A})}\left[\begin{array}{cc}
A_{22} & -A_{12} \\
-A_{21} & A_{11}
\end{array}\right]
$
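A quick NumPy sanity check of the trace, determinant, and 2x2 formulas above (the matrix is an arbitrary illustrative example):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

tr, det = np.trace(A), np.linalg.det(A)
eigvals = np.sort(np.linalg.eigvals(A))

# Trace = sum of eigenvalues, determinant = product of eigenvalues.
assert np.allclose(tr, eigvals.sum())
assert np.allclose(det, eigvals.prod())

# 2x2 eigenvalues from lambda^2 - Tr(A)*lambda + det(A) = 0.
disc = np.sqrt(tr**2 - 4 * det)
lam = np.sort([(tr + disc) / 2, (tr - disc) / 2])
assert np.allclose(eigvals, lam)

# 2x2 inverse formula: (1/det) * [[A22, -A12], [-A21, A11]].
A_inv = np.array([[A[1, 1], -A[0, 1]],
                  [-A[1, 0], A[0, 0]]]) / det
assert np.allclose(A_inv, np.linalg.inv(A))
```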
## Special Matrices

### Symmetric matrices

A matrix is called symmetric if $A = A^{T}$.

### Orthogonal matrices

A square matrix is called orthogonal if $Q^{-1} = Q^T$. Basic properties for the orthogonal matrix $\mathbf{Q}$:

$
\begin{aligned}
\mathbf{Q}^{-1} & =\mathbf{Q}^T \\
\mathbf{Q}^{-T} & =\mathbf{Q} \\
\mathbf{Q Q}^T & =\mathbf{I} \\
\mathbf{Q}^T \mathbf{Q} & =\mathbf{I} \\
\operatorname{det}(\mathbf{Q}) & = \pm 1
\end{aligned}
$

### Diagonalizable matrices

If a transformation (matrix) has enough eigenvectors that you can choose a set of them that spans the full space, then you can change the basis of your coordinate system to these eigenvectors. This new basis is called an eigenbasis. A nice consequence is that the transformation expressed in the eigenbasis is guaranteed to be a diagonal matrix, with the eigenvalues on its diagonal.

For example, let's say $\mathbf{A}$ is the change-of-basis matrix with the eigenvectors as its columns, and $\mathbf{D}$ represents our transformation in the old basis. Then $\mathbf{A}^{-1}\mathbf{D}\mathbf{A}$ is guaranteed to be a diagonal matrix with the eigenvalues on its diagonal. Diagonal matrices are much easier to work with: for example, applying the transformation $N$ times just means raising each diagonal entry to the $N$-th power.

### Orthogonally diagonalizable matrices

An $n \times n$ matrix $A$ is said to be orthogonally diagonalizable when an orthogonal matrix $P$ can be found such that $P^{-1} A P=P^T A P$ is a diagonal matrix.

### Positive definite and positive semi-definite matrices

- A square matrix is called positive definite if it is symmetric and all its eigenvalues $\lambda$ are positive, that is $\lambda > 0$. These matrices arise whenever optimization (maximum and minimum) problems are encountered.
- If $A$ is positive definite, then it is invertible and $\det(A) > 0$.
- If all $\lambda \geq 0$, then the matrix is called positive semi-definite.

### Toeplitz Matrices

A Toeplitz matrix $\mathbf{T}$ is a matrix whose elements are constant along each diagonal. In the $n \times n$ square case, it has the following structure:

$
\mathbf{T}=\left[\begin{array}{cccc}
t_{11} & t_{12} & \cdots & t_{1 n} \\
t_{21} & \ddots & \ddots & \vdots \\
\vdots & \ddots & \ddots & t_{12} \\
t_{n 1} & \cdots & t_{21} & t_{11}
\end{array}\right]=\left[\begin{array}{cccc}
t_0 & t_1 & \cdots & t_{n-1} \\
t_{-1} & \ddots & \ddots & \vdots \\
\vdots & \ddots & \ddots & t_1 \\
t_{-(n-1)} & \cdots & t_{-1} & t_0
\end{array}\right]
$

The Toeplitz matrix has some computational advantages. The addition of two Toeplitz matrices can be done with $\mathcal{O}(n)$ flops, and multiplication of two Toeplitz matrices can be done in $\mathcal{O}(n \ln n)$ flops. Toeplitz equation systems can be solved in $\mathcal{O}\left(n^2\right)$ flops. The inverse of a positive definite Toeplitz matrix can also be found in $\mathcal{O}\left(n^2\right)$ flops. The inverse of a Toeplitz matrix is persymmetric. The product of two lower triangular Toeplitz matrices is a Toeplitz matrix.

### Vandermonde Matrices

A Vandermonde matrix has the form:

$
\mathbf{V}=\left[\begin{array}{ccccc}
1 & v_1 & v_1^2 & \cdots & v_1^{n-1} \\
1 & v_2 & v_2^2 & \cdots & v_2^{n-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & v_n & v_n^2 & \cdots & v_n^{n-1}
\end{array}\right]
$

The transpose of $\mathbf{V}$ is also said to be a Vandermonde matrix.
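A short NumPy sketch (with made-up matrices) that spot-checks a few of the special-matrix properties above: an orthogonal rotation matrix, a symmetric positive definite matrix and its diagonalization in an eigenbasis, and a Vandermonde matrix.

```python
import numpy as np

# Orthogonal matrix: a rotation by 30 degrees.
theta = np.pi / 6
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
assert np.allclose(Q.T @ Q, np.eye(2))          # Q^T Q = I
assert np.isclose(abs(np.linalg.det(Q)), 1.0)   # det(Q) = +/- 1

# Symmetric positive definite matrix: all eigenvalues > 0.
S = np.array([[2.0, 1.0],
              [1.0, 3.0]])
assert np.allclose(S, S.T)
assert np.all(np.linalg.eigvalsh(S) > 0)

# Diagonalization: with eigenvectors as the columns of P,
# P^{-1} S P is diagonal with the eigenvalues on the diagonal.
eigvals, P = np.linalg.eigh(S)
assert np.allclose(np.linalg.inv(P) @ S @ P, np.diag(eigvals))

# Vandermonde matrix with rows [1, v, v^2, ...].
v = np.array([1.0, 2.0, 3.0])
print(np.vander(v, increasing=True))
```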
## Matrix Factorizations

The following are known as the big six matrix factorizations:

1. [[Cholesky Factorization]]
2. [[LU Factorization]]
3. [[QR Factorization]]
4. [[Schur Decomposition]]
5. [[Eigen Decomposition]]
6. [[Singular Value Decomposition (SVD)]]

Other factorizations/decompositions:

- [[Butterfly Factorization]]

## [[Matrix Calculus]]

---

### References

1. Orthogonal Diagonalizable Matrices https://www.math.wustl.edu/~freiwald/309orthogdiag.pdf
2. 3Blue1Brown, Essence of Linear Algebra https://www.youtube.com/watch?v=FCmH4MqbFGs
3. The Big Picture of Linear Algebra, Prof. Gilbert Strang https://www.youtube.com/watch?v=ggWYkes-n6E
4. The Big Six Matrix Factorizations https://nhigham.com/2022/05/18/the-big-six-matrix-factorizations/
5. The Art of Linear Algebra, Kenji Hiranabe, 2021
6. Orthogonal Diagonalization https://math.emory.edu/~lchen41/teaching/2020_Fall/Section_8-2.pdf
7. The Matrix Cookbook https://www.math.uwaterloo.ca/~hwolkowi/matrixcookbook.pdf