
Update eigenvalue page #10

Merged
merged 13 commits into from
Feb 23, 2024
72 changes: 54 additions & 18 deletions notes/eigen.md
@@ -38,25 +38,30 @@ $$ |\lambda_1| \geq |\lambda_2| \geq \cdots \geq |\lambda_n|, $$

and we normalize the eigenvectors so that $$\|{\bf x}\| = 1$$.

#### Example

First, we find the eigenvalues by solving for the roots of the characteristic polynomial.

$$ \mathbf{A}=\begin{bmatrix} 2 & 1 \\ 4 & 2 \end{bmatrix} \qquad \det(\mathbf{A}-\lambda\mathbf{I})=p(\lambda)=(2-\lambda)^2-4 \rightarrow \lambda_1=4,\ \lambda_2=0$$

Second, we find the eigenvectors for each eigenvalue by solving for the nontrivial solutions (the nullspace) of $$\mathbf{A}-\lambda\mathbf{I}$$. **Note:** any nonzero multiple of $$\mathbf{x}$$ below is an equally valid eigenvector for its eigenvalue.

$$\lambda_1=4: \begin{bmatrix} -2 & 1 \\ 4 & -2 \end{bmatrix}\mathbf{x}=0 \rightarrow \mathbf{x}=\begin{bmatrix} 1 \\ 2 \end{bmatrix} \qquad \lambda_2=0: \begin{bmatrix} 2 & 1 \\ 4 & 2 \end{bmatrix}\mathbf{x}=0 \rightarrow \mathbf{x}=\begin{bmatrix} 1 \\ -2 \end{bmatrix}$$

#### Code Example
The following code snippet finds and prints the eigenvalues and corresponding eigenvectors of a matrix. Take careful note that the eigenvectors are stored as the columns of a 2D NumPy array.

```python
import numpy as np
import numpy.linalg as la

def solve(A):
    # A: n x n matrix
    evals, evecs = la.eig(A)
    for ev in evals:
        print(ev)
    for i in range(np.shape(evecs)[1]):  # print column-wise
        print(evecs[:, i])
```

## Diagonalizability

@@ -93,6 +98,25 @@ the matrix $$\mathbf{X}$$ does not have an inverse, so we cannot diagonalize $$\
* The eigenvalues of an $$n \times n$$ matrix are not necessarily distinct. The number of times an eigenvalue is repeated is called its multiplicity.
* If an $$n \times n$$ matrix has $$n$$ linearly independent eigenvectors, then the matrix is diagonalizable.
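The diagonalization can be checked numerically. The following sketch (not part of the original notes; the matrix is an arbitrary diagonalizable example) reconstructs $$\mathbf{A}$$ as $$\mathbf{X} \mathbf{D} \mathbf{X}^{-1}$$, where $$\mathbf{D}$$ is the diagonal matrix of eigenvalues:

```python
import numpy as np
import numpy.linalg as la

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # arbitrary symmetric (hence diagonalizable) matrix
evals, X = la.eig(A)                # columns of X are the eigenvectors
D = np.diag(evals)

# With n linearly independent eigenvectors, X is invertible and A = X D X^{-1}
A_rebuilt = X @ D @ la.inv(X)
print(np.allclose(A, A_rebuilt))    # True
```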

## Eigenvalues of a Shifted Matrix

Given a matrix $$\mathbf{A}$$ and any constant scalar $$\sigma$$, we define the **_shifted matrix_** as $$\mathbf{A} - \sigma {\bf I}$$. If $$\lambda$$ is an eigenvalue of $$\mathbf{A}$$ with eigenvector $${\bf x}$$, then $$\lambda - \sigma$$ is an eigenvalue of the shifted matrix with the same eigenvector. This can be derived as follows:

$$ \begin{aligned} (\mathbf{A} - \sigma {\bf I}) {\bf x} &= \mathbf{A} {\bf x} - \sigma {\bf I} {\bf x} \\ &= \lambda {\bf x} - \sigma {\bf x} \\ &= (\lambda - \sigma) {\bf x}. \end{aligned} $$
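A quick numerical check of this relationship (a sketch, not from the original notes; the matrix and shift are arbitrary choices):

```python
import numpy as np
import numpy.linalg as la

A = np.array([[2.0, 1.0],
              [4.0, 2.0]])          # arbitrary test matrix
sigma = 3.0                         # arbitrary shift

evals_A = la.eig(A)[0]
evals_shifted = la.eig(A - sigma * np.eye(2))[0]

# Eigenvalues of A - sigma*I are the eigenvalues of A shifted by sigma
print(np.sort(evals_shifted))       # matches np.sort(evals_A) - sigma
```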

## Eigenvalues of an Inverse

An invertible matrix cannot have an eigenvalue equal to zero. Furthermore, the eigenvalues of the inverse matrix are the reciprocals of the eigenvalues of the original matrix:

$$ \mathbf{A} {\bf x} = \lambda {\bf x}\implies \\ \mathbf{A}^{-1} \mathbf{A} {\bf x} = \lambda \mathbf{A}^{-1} {\bf x} \implies \\ {\bf x} = \lambda \mathbf{A}^{-1} {\bf x}\implies \\ \mathbf{A}^{-1} {\bf x} = \frac{1}{\lambda} {\bf x}.$$
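The reciprocal relationship can likewise be verified numerically (a sketch; the test matrix is an arbitrary invertible choice):

```python
import numpy as np
import numpy.linalg as la

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])          # arbitrary invertible matrix (no zero eigenvalue)

evals_A = la.eig(A)[0]
evals_inv = la.eig(la.inv(A))[0]

# Eigenvalues of A^{-1} are the reciprocals of the eigenvalues of A
print(np.sort(evals_inv))           # matches np.sort(1.0 / evals_A)
```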

## Eigenvalues of a Shifted Inverse

Similarly, we can describe the eigenvalues for shifted inverse matrices as:

$$ (\mathbf{A} - \sigma {\bf I})^{-1} {\bf x} = \frac{1}{\lambda - \sigma} {\bf x}.$$

It is important to note that the eigenvectors remain unchanged when the matrix is shifted and/or inverted.
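Both claims, the transformed eigenvalue and the unchanged eigenvector, can be checked in one pass (a sketch; matrix and shift are arbitrary illustrative choices):

```python
import numpy as np
import numpy.linalg as la

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])          # arbitrary test matrix
sigma = 1.0                         # arbitrary shift (A - sigma*I must be invertible)

evals, evecs = la.eig(A)
B = la.inv(A - sigma * np.eye(2))   # the shifted inverse

# Each eigenvector x of A is also an eigenvector of B,
# with eigenvalue 1 / (lambda - sigma)
for lam, x in zip(evals, evecs.T):
    print(np.allclose(B @ x, x / (lam - sigma)))   # True for every pair
```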

## Expressing an Arbitrary Vector as a Linear Combination of Eigenvectors

@@ -285,19 +309,30 @@ $$ \mathbf{e}_{k+1} \approx \frac{|\lambda_2|}{|\lambda_1|}\mathbf{e}_k$$.
The convergence rate for (shifted) inverse iteration is also linear, but now depends on the two closest eigenvalues to the shift $$\sigma$$. (Remember that standard inverse iteration corresponds to a shift $$\sigma = 0$$.) The recurrence relationship for the errors is given by:
$$ \mathbf{e}_{k+1} \approx \frac{|\lambda_\text{closest} - \sigma|}{|\lambda_\text{second-closest} - \sigma|}\mathbf{e}_k.$$
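The iteration described above can be sketched as follows (not from the original notes; the test matrix, shift, and iteration count are arbitrary illustrative choices). Each step solves the system $$(\mathbf{A} - \sigma \mathbf{I})\boldsymbol{x}_{k+1} = \boldsymbol{x}_k$$, and the iterate converges to the eigenvector whose eigenvalue is closest to $$\sigma$$:

```python
import numpy as np
import numpy.linalg as la

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])     # arbitrary symmetric test matrix
sigma = 0.9                         # arbitrary shift, near the smallest eigenvalue

x = np.ones(3)                      # arbitrary starting vector
B = A - sigma * np.eye(3)
for _ in range(20):
    x = la.solve(B, x)              # one step: (A - sigma*I) x_{k+1} = x_k
    x /= la.norm(x)                 # normalize to avoid overflow/underflow

# The Rayleigh quotient recovers the eigenvalue of A closest to sigma
print(x @ A @ x)                    # approx. 0.54 for this matrix
```

In practice one would factorize $$\mathbf{A} - \sigma \mathbf{I}$$ once (the $$n^3$$ term in the cost table below) and reuse the factorization each iteration rather than calling `la.solve` repeatedly.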

## Cost and Convergence Summary

| Method | Description | Cost | Convergence |
|--------------------------------|-----------------------------------------------------------------------------------------|------------------|-------------------------------------------------------|
| Power Method | $$\boldsymbol{x}_{k+1} = \mathbf{A} \boldsymbol{x}_{k}$$ | $$kn^2$$ | $$\left\|\frac{\lambda_2}{\lambda_1}\right\|$$ |
| Inverse Power Method | $$\mathbf{A} \boldsymbol{x}_{k+1} = \boldsymbol{x}_{k}$$ | $$n^{3} + kn^2$$ | $$\left\|\frac{\lambda_n}{\lambda_{n-1}}\right\|$$ |
| Shifted Inverse Power Method | $$(\mathbf{A} - \sigma \mathbf{I}) \boldsymbol{x}_{k+1} = \boldsymbol{x}_{k}$$ | $$n^{3} + kn^2$$ | $$\left\|\frac{\lambda_c-\sigma}{\lambda_{c2}-\sigma}\right\|$$ |


$$\lambda_1$$: largest eigenvalue (in magnitude) \\
$$\lambda_2$$: second largest eigenvalue (in magnitude) \\
$$\lambda_n$$: smallest eigenvalue (in magnitude) \\
$$\lambda_{n-1}$$: second smallest eigenvalue (in magnitude) \\
$$\lambda_c$$: closest eigenvalue to $$\sigma$$ \\
$$\lambda_{c2}$$: second closest eigenvalue to $$\sigma$$
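To make the first table row concrete, here is a minimal power-method sketch (the matrix, starting vector, and iteration count are arbitrary illustrative choices):

```python
import numpy as np
import numpy.linalg as la

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])          # arbitrary test matrix; eigenvalues are 5 and 2

x = np.array([1.0, 0.0])            # arbitrary starting vector
for _ in range(50):
    x = A @ x                       # one matrix-vector product: O(n^2) per iteration
    x /= la.norm(x)                 # normalize to avoid overflow

print(x @ A @ x)                    # Rayleigh quotient, approx. 5 (dominant eigenvalue)
```

With $$|\lambda_2/\lambda_1| = 2/5$$, the error shrinks by a factor of $$0.4$$ per iteration, matching the linear convergence rate in the table.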

## Orthogonal Matrices

Square matrices are called **_orthogonal_** if and only if the columns are mutually orthogonal to one another and have a norm of <span>$$1$$</span> (such a set of vectors is formally known as an **_orthonormal_** set), i.e.:
$$\boldsymbol{c}_i^T \boldsymbol{c}_j = 0 \quad \forall \ i \neq j, \quad \|\boldsymbol{c}_i\| = 1 \quad \forall \ i \iff \mathbf{A} \in \mathcal{O}(n),$$
or
$$ \langle\boldsymbol{c}_i,\boldsymbol{c}_j \rangle = \begin{cases} 0 \quad \mathrm{if} \ i \neq j, \\ 1 \quad \mathrm{if} \ i = j \end{cases} \iff \mathbf{A} \in \mathcal{O}(n),$$
where $$\mathcal{O}(n)$$ is the set of all $$n \times n$$ orthogonal matrices called the orthogonal group, $$\boldsymbol{c}_i$$, $$i=1, \dots, n$$, are the columns of <span>$$\mathbf{A}$$</span>, and $$\langle \cdot, \cdot \rangle$$ is the inner product operator. Orthogonal matrices have many desirable properties:

(a) $$ \mathbf{A}^T \in \mathcal{O}(n) $$\\
(b) $$ \mathbf{A}^T \mathbf{A} = \mathbf{A} \mathbf{A}^T = \mathbf{I} \implies \mathbf{A}^{-1} = \mathbf{A}^T $$\\
(c) $$ \det{\mathbf{A}} = \pm 1 $$\\
@@ -315,6 +350,7 @@ where $$\langle \cdot, \cdot \rangle$$ is the inner product operator. Each of th

## ChangeLog

* 2024-02-11 Pascal Adhikary <[email protected]>: add ev examples, cost table, reorganize
* 2022-02-28 Yuxuan Chen <[email protected]>: added learning objectives, cost summary
* 2020-03-01 Peter Sentz: added text to include content from slides
* 2018-10-14 Erin Carrier <[email protected]>: removes orthogonal/GS sections