Source: http://onmyphd.com/?p=eigen.decomposition

- Basics of linear algebra

- Eigenwhat?
- Eigendecomposition
- Why is eigendecomposition useful?
- Properties of eigendecomposition
- How to compute eigendecomposition?

Eigen means own or self. In linear algebra, eigenvalue, eigenvector and eigendecomposition are terms that are intrinsically related. Eigendecomposition is the method to decompose a square matrix into its eigenvalues and eigenvectors. For a matrix $A$, if $$\begin{equation}A\mathbf{v}=\lambda \mathbf{v}\label{eq:Avlv}\end{equation}$$ then $\mathbf{v}$ is an eigenvector of matrix $A$ and $\lambda$ is the corresponding eigenvalue. That is, if matrix $A$ is multiplied by a vector and the result is a scaled version of the same vector, then it is an eigenvector of $A$ and the scaling factor is its eigenvalue.

So how do we find the eigenvectors of a matrix? From $\eqref{eq:Avlv}$: $$A\mathbf{v}-\lambda I \mathbf{v} = 0$$ $$\begin{equation}(A -\lambda I) \mathbf{v} = 0\label{eq:AlI}\end{equation},$$ where $I$ is the identity matrix. The values of $\lambda$ where $\eqref{eq:AlI}$ holds are the eigenvalues of $A$. It turns out that this equation is equivalent to: $$\begin{equation}det(A-\lambda I) = 0,\label{eq:detAlI}\end{equation}$$ where det() is the determinant of a matrix.

Proof that $det(A-\lambda I) \equiv (A-\lambda I) \mathbf{v}=0$

First, you must know that a matrix is non-invertible if and only if its determinant is zero. So, for the values of $\lambda$ that $\eqref{eq:detAlI}$ holds, $A-\lambda I$ is non-invertible (singular). In those cases, you cannot left-multiply both sides of $\eqref{eq:AlI}$ by $(A-\lambda I)^{-1}$ (since there is no inverse) to get:
$$\mathbf{v} = 0,$$
which means that in those cases, the solution for $\eqref{eq:Avlv}$ is different from $\mathbf{v} = 0$ and $\lambda$ is an eigenvalue of $A$.

Let's see the eigendecomposition for the matrix:
$$A=\left[\begin{array}{cc}1 & 0\\1 & 3\\\end{array}\right]$$
From $\eqref{eq:detAlI}$:
$$det\left(\left[\begin{array}{cc}1-\lambda & 0\\1 & 3-\lambda\\\end{array}\right]\right) = 0$$
$$(1-\lambda)(3-\lambda) = 0$$
we get directly $\lambda_1 = 1$ and $\lambda_2 = 3$. The above expression is usually referred as the **characteristic polinomial** or **characteristic equation** of a matrix.

Plugging $\lambda_1$ into $\eqref{eq:Avlv}$, we get:
$$\left[\begin{array}{cc}1 & 0\\1 & 3\\\end{array}\right]\left[\begin{array}{c}v_{11}\\v_{12}\\\end{array}\right]= 1 \left[\begin{array}{c}v_{11}\\v_{12}\\\end{array}\right]$$
from which we get $v_{11} = -2v_{12}$. That is, any vector $\mathbf{v_1} = [v_{11}, v_{12}]$ where $v_{11} = -2v_{12}$ is an eigenvector of $A$ with eigenvalue 1.

Plugging $\lambda_2$ into $\eqref{eq:Avlv}$, we get:
$$\left[\begin{array}{cc}1 & 0\\1 & 3\\\end{array}\right]\left[\begin{array}{c}v_{21}\\v_{22}\\\end{array}\right]= 3 \left[\begin{array}{c}v_{21}\\v_{22}\\\end{array}\right]$$
from which we get $v_{21} = 0$ and $v_{22} \in \mathbb{R}$. That is, any vector $\mathbf{v_2} = [v_{21}, v_{22}]$ where $v_{21} = 0$ is an eigenvector of $A$ with eigenvalue 3.

Referring to our previous example, we can join both eigenvectors and eigenvalues in a single matrix equation:
$$A\left[\mathbf{v_1 v_2}\right] = \left[\begin{array}{cc}1 & 0\\1 & 3\\\end{array}\right]\left[\begin{array}{cc}v_{11} & v_{21}\\v_{12} & v_{22}\\\end{array}\right] =\left[\begin{array}{cc}v_{11} & v_{21}\\v_{12} & v_{22}\\\end{array}\right]\left[\begin{array}{cc}\lambda_1 & 0\\0 & \lambda_2\\\end{array}\right] =\left[\mathbf{v_1 v_2}\right]\left[\begin{array}{cc}\lambda_1 & 0\\0 & \lambda_2\\\end{array}\right]$$
If we replace:
$$\Lambda = \left[\begin{array}{cc}\lambda_1 & 0\\0 & \lambda_2\\\end{array}\right]$$
$$V = \left[\mathbf{v_1 v_2}\right]$$
it is also true that:
$$AV = V\Lambda$$
$$\begin{equation}A = V\Lambda V^{-1}\label{eq:AVLV}\end{equation}$$
**Eigendecomposition decomposes a matrix $A$ into a multiplication of a matrix of eigenvectors $V$ and a diagonal matrix of eigenvalues $\Lambda$.** This can only be done if a matrix is **diagonalizable**. In fact, the definition of a diagonalizable matrix $A \in \mathbb{R}^{n \times n}$ is that it can be eigendecomposed into $n$ eigenvectors, so that $V^{-1}AV = \Lambda$.

From $\eqref{eq:AVLV}$: $$A^{-1} = V \Lambda^{-1}V^{-1}$$ The inverse of $\Lambda$ is just the inverse of each diagonal element (the eigenvalues).

From $\eqref{eq:AVLV}$: $$A^2 = V \Lambda V^{-1} V \Lambda V^{-1} = V \Lambda^{2} V^{-1}$$ $$A^n = V \Lambda^n V^{-1}$$ The power of $\Lambda$ is just the power of each diagonal element. This becomes much simpler than multiplications of A.

- $det(A)=\prod_{i=1}^{n}\lambda_i$ (the determinant of A is equal to the product of its eigenvalues)
- $tr(A)=\sum_{i=1}^{n}\lambda_i$ (the trace of A is equal to the sum of its eigenvalues)
- The eigenvalues of $A^{-1}$ are $\lambda_i^{-1}$
- The eigenvalues of $A^{n}$ are $\lambda_i^{n}$
- In general, the eigenvalues of $f(A)$ are $f(\lambda_i)$
- The eigenvectors of $A^{-1}$ are the same as the eigenvectors of $A$.
- if $A$ is hermitian (its conjugate transpose is equal to itself) and full-rank (all rows or columns are linearly independent), then the eigenvectors are mutually orthogonal (the dot-product between any two eigenvectors is zero) and the eigenvalues are real.
- $A$ is invertible if all its eigenvalues are different from zero and vice-versa.
- If the eigenvalues of matrix $A$ are distinct (not repeated), then A can be eigendecomposed.

Calculating the characteristic polinomial and then solving it with the respect to the eigenvalues becomes impractical as the size of the matrix increases. In practice, iterative algorithms are used to eigendecompose a matrix.

Power iteration is an iterative method to calculate the highest eigenvalue and its associated eigenvector. Only the highest value/vector is found, so this method as limited use.

First, we start with some vector $b_0$, which can be an educated guess of the dominant eigenvector or a random vector. Then, iterate through the following equation: $$b_{k+1} = \frac{A b_k}{\left\Vert A b_k \right\Vert}.$$ At each iteration, the vector is left-multiplied by matrix $A$ and normalized, converging to the dominant eigenvector. This method only works if:

- $A$ has an eigenvalue greater or equal to all others.
- Vector $b_0$ has a nonzero component in the direction of the dominant eigenvector (i.e. their dot-product is different from zero)

Using our example matrix $A$ and initial vector: $$b_0 = \left[\begin{array}{c}1\\1\\\end{array}\right]$$ For the first step: $$b_1 = \frac{\left[\begin{array}{cc}1 & 0\\1 & 3\\\end{array}\right]\left[\begin{array}{c}1\\1\\\end{array}\right]}{\left\Vert\left[\begin{array}{cc}1 & 0\\1 & 3\\\end{array}\right]\left[\begin{array}{c}1\\1\\\end{array}\right]\right\Vert}= \frac{\left[\begin{array}{c}1\\4\\\end{array}\right]}{5} = \left[\begin{array}{c}0.2\\0.8\\\end{array}\right]$$ For the next steps, reuse the last $b$ and: $$b_2= \left[\begin{array}{c}0.07\\0.93\\\end{array}\right], b_3= \left[\begin{array}{c}0.02\\0.98\\\end{array}\right], b_4=\left[\begin{array}{c}0.01\\0.99\\\end{array}\right], b_5= \left[\begin{array}{c}0\\1\\\end{array}\right]$$ and $$ \left\Vert A b_5 \right\Vert = 2.99$$ If you recall, the highest eigenvalue of $A$ is 3 and its eigenvector is $\mathbf{v} = [ v_{21}, v_{22}]$, where $v_{21} = 0$ and $v_{22}$ can have any value.

The QR algorithm uses the QR decomposition iteratively to make the eigendecomposition. Recall that the QR decomposition decomposes a matrix $A$ into an orthogonal matrix $Q$ and an upper triangular matrix $R$ as $A = QR$.