
% Chapter 4, Section 4 _Linear Algebra_ Jim Hefferon
%  http://joshua.smcvt.edu/linearalgebra
%  2001-Jun-12
\section{Jordan Form}
\index{Jordan form|(}
\noindent\textit{This section uses material from three optional
subsections:~Combining Subspaces, Determinants Exist, and
Laplace's Expansion.}

We began this chapter
by remembering that every linear map $\map{h}{V}{W}$ can
be represented by a partial identity matrix with respect to some
bases $B\subset V$ and $D\subset W$.
That is, the partial identity form is a canonical form for 
matrix equivalence. 
This chapter considers transformations, where
the codomain equals the domain, so 
we naturally ask what is possible when the
two bases are equal $\rep{t}{B,B}$.
In short, we want a canonical form for matrix similarity.

We noted that in the $B,B$ case
a partial identity matrix is not always possible.
We therefore extended the matrix 
forms of interest to the natural generalization, 
diagonal matrices, and
showed that a transformation or square matrix can be diagonalized
if its eigenvalues are distinct.
But at the same time we gave an example of a 
square matrix that cannot be diagonalized
(because it is nilpotent)
and thus diagonal form won't suffice as the canonical form 
for matrix similarity. 

The prior section developed that example
to get a canonical form, subdiagonal ones, for nilpotent matrices.

This section finishes our program by
showing that for any linear transformation there is
a basis such that the matrix representation $\rep{t}{B,B}$ is the sum of a 
diagonal matrix and a nilpotent matrix.
This is Jordan canonical form.


\subsectionoptional{Polynomials of Maps and Matrices}\index{polynomial!of map, matrix}
Recall that the set of square matrices~\( \matspace_{\nbyn{n}} \)
is a vector space under entry-by-entry addition and scalar multiplication,
and that 
%the unit matrices\Dash all entries are zero except
%for a single entry, which is one\Dash form a basis, so 
this space has dimension \( n^2 \).
Thus, for any \( \nbyn{n} \) matrix $T$ the
\( n^2+1 \)-member set \( \set{I,T,T^2,\dots,T^{n^2} } \) is linearly
dependent and so there are scalars \( c_0,\dots,c_{n^2} \),
not all zero, such that
\begin{equation*}
  c_{n^2}T^{n^2}+\dots+c_1T+c_0I
\end{equation*}
is the zero matrix.
Therefore every transformation has a kind of generalized 
nilpotency:~the powers
of a square matrix cannot climb forever without a ``repeat.''

\begin{example}  \label{ex:PolySendRotMatToZ}
Rotation of plane vectors \( \pi/6 \)~radians counterclockwise is represented
with respect to the standard basis by
\begin{equation*}
  T=
  \begin{mat}[r]
     \sqrt{3}/2  &-1/2  \\
     1/2         &\sqrt{3}/2
  \end{mat}
\end{equation*}
and verifying that \( 0T^4+0T^3+1T^2-\sqrt{3}T+1I \) equals the zero matrix is easy.
\end{example}
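
\noindent One way to carry out that check, written out as a sketch:~squaring
$T$ gives the rotation through $\pi/3$~radians, and combining it with
multiples of $T$ and~$I$ gives the zero matrix.
\begin{equation*}
  T^2-\sqrt{3}\,T+I
  =\begin{mat}[r]
     1/2         &-\sqrt{3}/2  \\
     \sqrt{3}/2  &1/2
   \end{mat}
  -\begin{mat}[r]
     3/2         &-\sqrt{3}/2  \\
     \sqrt{3}/2  &3/2
   \end{mat}
  +\begin{mat}[r]
     1  &0  \\
     0  &1
   \end{mat}
  =\begin{mat}[r]
     0  &0  \\
     0  &0
   \end{mat}
\end{equation*}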

\begin{definition}
Let \( t \) be a linear transformation of a vector space~\( V \).
Where \( f(x)=c_nx^n+\dots+c_1x+c_0 \) is a polynomial,
\( f(t) \) is the
transformation \( c_nt^n+\dots+c_1t+c_0(\identity) \) on~\( V \).
In the same way, if \( T \) is a square matrix
then
\( f(T) \) is the matrix \( c_nT^n+\dots+c_1T+c_0I \).
\end{definition}

\noindent The polynomial of the matrix represents the polynomial of the map:~if 
\( T=\rep{t}{B,B} \) then \( f(T)=\rep{f(t)}{B,B} \).
This is because \( T^j=\rep{t^j}{B,B} \),
and \( cT=\rep{ct}{B,B} \), and \( T_1+T_2 =\rep{t_1+t_2}{B,B} \).
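
As a small illustration of a polynomial of a matrix
(the polynomial and the matrix here are chosen only for this sketch),
take \( f(x)=x^2+2x+3 \) and compute \( f(T)=T^2+2T+3I \).
\begin{equation*}
  T=\begin{mat}[r]
      1  &2  \\
      0  &1
    \end{mat}
  \qquad
  f(T)
  =\begin{mat}[r]
     1  &4  \\
     0  &1
   \end{mat}
  +\begin{mat}[r]
     2  &4  \\
     0  &2
   \end{mat}
  +\begin{mat}[r]
     3  &0  \\
     0  &3
   \end{mat}
  =\begin{mat}[r]
     6  &8  \\
     0  &6
   \end{mat}
\end{equation*}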

\begin{remark}
Most authors write the matrix polynomial slightly differently than the 
map polynomial. 
For instance, if  \( f(x)=x-3 \) then 
most authors explicitly write the identity matrix~\( f(T)=T-3I \)
but don't write the identity map~\( f(t)=t-3 \).
We shall follow this convention.
\end{remark}

Consider again \nearbyexample{ex:PolySendRotMatToZ}.
The space $\matspace_{\nbyn{2}}$ has dimension four so we know that for any
\( \nbyn{2} \) matrix there is a fourth degree polynomial \( f  \) such that
\( f(T) \) equals the zero matrix. 
But for the \( T \) in that example  
we exhibited a polynomial of degree less than four that gives the zero matrix. 
So while
degree~$n^2$ always suffices, in some cases  
a smaller-degree polynomial works.

\begin{definition}
The \definend{minimal polynomial}\index{polynomial!minimal}%
\index{minimal polynomial}
\( m(x) \) of a transformation \( t \)\index{transformation!minimal polynomial}
or a square matrix \( T \)\index{matrix!minimal polynomial} is the
polynomial of least degree and with leading coefficient one
such that \( m(t) \) is the zero map or \( m(T) \) is the zero matrix.
\end{definition}

\noindent A minimal
polynomial cannot be the zero polynomial because 
of the restriction on the leading coefficient.
Obviously no other constant polynomial would do, so a minimal
polynomial must have degree at least one.
Thus, the zero matrix has minimal polynomial $p(x)=x$ while the 
identity matrix has minimal polynomial $\hat{p}(x)=x-1$.
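
For a case where degree one is not enough, consider as an illustration
(this matrix is our choice for the sketch)
the projection onto the first axis.
\begin{equation*}
  P=\begin{mat}[r]
      1  &0  \\
      0  &0
    \end{mat}
  \qquad
  P-cI=\begin{mat}
      1-c  &0  \\
      0    &-c
    \end{mat}
  \qquad
  P^2-P=\begin{mat}[r]
      0  &0  \\
      0  &0
    \end{mat}
\end{equation*}
No scalar~$c$ makes \( P-cI \) the zero matrix, so no polynomial $x-c$ will do.
But \( P^2=P \), so the minimal polynomial of $P$ is \( x^2-x \).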

\begin{lemma}
Any transformation or square matrix has a unique minimal polynomial.  
\end{lemma}

\begin{proof}
We first prove existence.
By the earlier observation 
that degree~$n^2$ suffices, there is at least one 
polynomial $p(x)=c_kx^k+\cdots+c_0$ that
takes the map or matrix to zero, and 
it is not the zero polynomial by the prior paragraph.
From among all such polynomials
there must be at least one with minimal degree.
Divide this polynomial by its leading coefficient~$c_k$ to get a leading~$1$.
Hence any map or matrix has a minimal polynomial.

Now for uniqueness.
Suppose that 
\( m(x) \) and \( \hat{m}(x) \) both take the map or matrix to zero,
are both of 
minimal degree and are thus of equal degree, 
and both have a leading~$1$.
Subtract: \( d(x)=m(x)-\hat{m}(x) \).
This polynomial takes the map or matrix to zero
and since the leading terms of $m$ and~$\hat{m}$ cancel,
$d$ is of smaller degree than the other two.
If $d$ were to have a nonzero leading coefficient then we could divide
by it to get a polynomial that takes the map or matrix to zero and
has leading coefficient~$1$.
This would contradict the minimality of the degree
of $m$ and $\hat{m}$. 
Thus the leading coefficient of $d$ is zero,
so \( m(x)-\hat{m}(x) \) is the zero polynomial, 
and so the two are equal.
\end{proof}

\begin{example}  \label{ex:MinPolyForRotMat}
We can compute that \( m(x)=x^2-\sqrt{3}x+1 \) is minimal for the matrix of
\nearbyexample{ex:PolySendRotMatToZ} by finding the powers of $T$
up to $n^2=4$.
\begin{equation*}
   T^2=
   \begin{mat}[r]
      1/2         &-\sqrt{3}/2  \\
      \sqrt{3}/2  &1/2
   \end{mat} 
   \quad
   T^3=
   \begin{mat}[r]
      0           &-1           \\
      1           &0
   \end{mat}
   \quad
   T^4=
   \begin{mat}[r]
      -1/2        &-\sqrt{3}/2  \\
      \sqrt{3}/2  &-1/2
   \end{mat}
\end{equation*}
Put \( c_4T^4+c_3T^3+c_2T^2+c_1T+c_0I \) equal to the zero matrix
\begin{equation*}
  \begin{linsys}{5}
     -(1/2)c_4  &  &             &+ &(1/2)c_2
         &+ &(\sqrt{3}/2)c_1  &+  &c_0  &=  &0      \\
     -(\sqrt{3}/2)c_4  &- &c_3 &- &(\sqrt{3}/2)c_2
         &- &(1/2)c_1  &   &          &=  &0        \\
      (\sqrt{3}/2)c_4  &+ &c_3 &+ &(\sqrt{3}/2)c_2
         &+ &(1/2)c_1  &   &            &=  &0      \\
     -(1/2)c_4  &  &             &+ &(1/2)c_2
         &+ &(\sqrt{3}/2)c_1  &+  &c_0  &=  &0
   \end{linsys}
\end{equation*}
and use Gauss' Method.
\begin{equation*}
  \begin{linsys}{5}
     c_4  &  &             &- &c_2
         &- &\sqrt{3}c_1  &-  &2c_0  &=  &0      \\
                           &  &c_3 &+ &\sqrt{3}c_2
         &+ &2c_1  &+  &\sqrt{3}c_0 &=  &0
   \end{linsys} 
\end{equation*}
Setting \( c_4 \), \( c_3 \), and \( c_2 \) to zero forces \( c_1 \) and
\( c_0 \) to also come out as zero.
To get a leading one, the most we can do is to set \( c_4 \) and \( c_3 \) to
zero.
Thus the minimal polynomial is quadratic.
\end{example}

Using the method of that example to find the minimal polynomial of a
\( \nbyn{3} \) matrix 
would mean doing Gaussian reduction on
a system with nine equations in ten unknowns.
We shall develop an alternative.
% To begin, note that we can break a polynomial of a map or a matrix into
% its components.
% (For this lemma, recall that we are using complex numbers in this chapter
% so all polynomials break completely into linear factors.)

\begin{lemma} \label{le:PolyMapsFactor}
Suppose that the polynomial \( f(x)=c_nx^n+\dots+c_1x+c_0 \) factors as
\( k(x-\lambda_1)^{q_1}\cdots(x-\lambda_z)^{q_z} \).
If \( t \) is a linear transformation then these two are equal maps. 
\begin{equation*}
  c_nt^n+\dots+c_1t+c_0
  =
  k\cdot\composed{\composed{(t-\lambda_1)^{q_1}}{\cdots}}{
      (t-\lambda_z)^{q_z}} 
\end{equation*}
Consequently, if \( T \) is a square matrix then \( f(T) \) and
\( k\cdot(T-\lambda_1I)^{q_1}\cdots(T-\lambda_z I)^{q_z} \) 
are equal matrices.
\end{lemma}

\begin{proof}
We use induction on the degree of the polynomial.
The cases where the polynomial is of
degree~zero and degree~one are clear.
The full induction argument is \nearbyexercise{le:PolyMapsFactor}
but we will give its sense with the degree~two case.

A quadratic polynomial factors into two
linear terms \( f(x)=k(x-\lambda_1)\cdot(x-\lambda_2)
                    =k(x^2+(-\lambda_1-\lambda_2)x+\lambda_1\lambda_2) \)
(the roots $\lambda_1$ and $\lambda_2$ could be equal).
We can check that substituting \( t \) 
for \( x \) in the factored and
unfactored versions gives the same map.
\begin{align*}
   \bigl(k\cdot\composed{(t-\lambda_1)}{(t-\lambda_2)}\bigr)\>(\vec{v})
   &=\bigl(k\cdot(t-\lambda_1)\bigr)\,(t(\vec{v})-\lambda_2\vec{v})    \\
   &=k\cdot\bigl(t(t(\vec{v}))-t(\lambda_2\vec{v})
      -\lambda_1 t(\vec{v})+\lambda_1\lambda_2\vec{v}\bigr)    \\
   &=k\cdot \bigl(\composed{t}{t}\,(\vec{v})-(\lambda_1+\lambda_2)t(\vec{v})
          +\lambda_1\lambda_2\vec{v}\bigr)                    \\
   &=k\cdot(t^2-(\lambda_1+\lambda_2)t+\lambda_1\lambda_2)\>(\vec{v})
\end{align*}
The third equality holds because the scalar $\lambda_2$  comes out of the
second term, since \( t \) is linear.
\end{proof}
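
As a concrete check of the matrix version
(the polynomial and the matrix are chosen only for this sketch),
take \( f(x)=x^2-3x+2=(x-1)(x-2) \) and the matrix~$T$ that swaps the two
coordinates.
Since \( T^2=I \), the unfactored form gives \( f(T)=T^2-3T+2I=3I-3T \),
and the factored form gives the same matrix.
\begin{equation*}
  T=\begin{mat}[r]
      0  &1  \\
      1  &0
    \end{mat}
  \qquad
  (T-1I)(T-2I)
  =\begin{mat}[r]
     -1  &1   \\
      1  &-1
   \end{mat}
   \begin{mat}[r]
     -2  &1   \\
      1  &-2
   \end{mat}
  =\begin{mat}[r]
      3  &-3  \\
     -3  &3
   \end{mat}
  =3I-3T
\end{equation*}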

In particular, if a minimal polynomial $m(x)$ for a transformation $t$ 
factors as
$m(x)=(x-\lambda_1)^{q_1}\cdots (x-\lambda_z)^{q_z}$
then 
\( m(t)=\composed{\composed{(t-\lambda_1)^{q_1}}{\cdots}}{
      (t-\lambda_z)^{q_z}} \) 
is the zero map. 
Since \( m(t) \) sends every vector to zero, at least
one of the maps \( t-\lambda_i \)  sends some
nonzero vectors to zero.
Exactly the same holds in the matrix case\Dash if $m$ is minimal for $T$ then
\( m(T)=(T-\lambda_1I)^{q_1}\cdots (T-\lambda_z I)^{q_z} \)
is the zero matrix and at least one of the matrices $T-\lambda_iI$
sends some nonzero vectors to zero. 
That is, in both cases at least some of the \( \lambda_i \) are eigenvalues.
(\nearbyexercise{exer:SomeRootsMinPolyAreEigs} expands on this.)

The next result is that
every root of the minimal polynomial is an eigenvalue, and further
that every eigenvalue is a root of the minimal polynomial 
(i.e., below it says `$1\leq q_i$' and 
not just `$0\leq q_i$').
For that result, recall that to find eigenvalues
we solve $\deter{T-xI}=0$ and 
this determinant gives a polynomial in $x$, 
called the characteristic polynomial, 
whose roots are the eigenvalues.

\begin{theorem}[Cayley-Hamilton]
\label{th:CayHam}
\index{Cayley-Hamilton theorem}
\hspace*{0em plus2em}
If the characteristic polynomial of a transformation or square matrix
factors into
\begin{equation*}
  k\cdot (x-\lambda_1)^{p_1}(x-\lambda_2)^{p_2}\cdots(x-\lambda_z)^{p_z}
\end{equation*}
then its minimal polynomial factors into
\begin{equation*}
  (x-\lambda_1)^{q_1}(x-\lambda_2)^{q_2}\cdots(x-\lambda_z)^{q_z}
\end{equation*}
where \( 1\leq q_i \leq p_i \) for each \( i \) between \( 1 \) and \( z \).
\end{theorem}

\noindent The proof takes up the next three lemmas.
We will state them in matrix terms but they apply equally
well to maps.
(The matrix version is convenient 
for the first proof.)

The first result is the key.
For the proof, observe that we can view
a matrix of polynomials as a polynomial with
matrix coefficients.
\begin{equation*}
   \begin{mat}
     2x^2+3x-1  &x^2+2    \\
     3x^2+4x+1  &4x^2+x+1
   \end{mat}
 = \begin{mat}[r]
    2  &1  \\
    3  &4
  \end{mat}x^2
 + \begin{mat}[r]
    3  &0  \\
    4  &1
  \end{mat}x
 + \begin{mat}[r]
   -1  &2  \\
    1  &1
  \end{mat}
\end{equation*}

\begin{lemma}   \label{le:MatSatItsCharPoly}
If \( T \) is a square matrix with characteristic polynomial \( c(x) \)
then \( c(T) \) is the zero matrix.
\end{lemma}

\begin{proof}
Let \( C \) be \( T-xI \),
the matrix whose determinant is the characteristic polynomial
\( c(x)=c_nx^n+\dots+c_1x+c_0 \).
\begin{equation*}
  C=\begin{mat}
    t_{1,1}-x        &t_{1,2}   &\ldots        \\
    t_{2,1}          &t_{2,2}-x               \\
    \vdots           &          &\ddots       \\
                     &          &       &t_{n,n}-x
  \end{mat}
\end{equation*}
Recall Theorem~Four.III.\ref{th:MatTimesAdjEqDiagDets},
that the product of a matrix with its adjoint equals
the determinant of the matrix times the identity.
\begin{equation*}
  c(x)\cdot I
  =\adj (C)C
  =\adj (C)(T-xI)
  =\adj (C)T- \adj(C)\cdot x
\tag*{($*$)}
\end{equation*}
The left side of~($*$) is 
$c_nIx^n+c_{n-1}Ix^{n-1}+\dots+c_1Ix+c_0I$.
For the right side,
the entries of \( \adj (C) \) are polynomials, each of degree
at most \( n-1 \) since the minors of a matrix drop a row and column.
As suggested before the proof, rewrite it as a polynomial with
matrix coefficients:
\( \adj (C)=C_{n-1}x^{n-1}+\dots+C_1x+C_0 \)
where each \( C_i \) is a matrix of scalars.
Now this is the right side of~($*$).
\begin{equation*}
  [(C_{n-1}T)x^{n-1}+\dots+(C_1T)x+C_0T]  
   -[C_{n-1}x^n+C_{n-2}x^{n-1}+\dots+C_0x]
\end{equation*}
Equate the left and right side of ($*$)'s
coefficients of \( x^n \), of $x^{n-1}$, etc.
\begin{align*}
  c_nI
  &=-C_{n-1}    \\
  c_{n-1}I
  &=-C_{n-2}+C_{n-1}T    \\
  &\vdotswithin{=}             \\
  c_{1}I
  &=-C_{0}+C_{1}T    \\
  c_{0}I
  &=C_{0}T
\end{align*}
Multiply, from the right, both sides of the first equation by \( T^n \), 
both sides of the second equation by \( T^{n-1} \), etc.
\begin{align*}
  c_nT^n
  &=-C_{n-1}T^n    \\
  c_{n-1}T^{n-1}
  &=-C_{n-2}T^{n-1}+C_{n-1}T^n    \\
  &\vdotswithin{=}             \\
  c_{1}T
  &=-C_{0}T+C_{1}T^2    \\
  c_{0}I
  &=C_{0}T
\end{align*}
Add.
The left is 
\( c_nT^n+c_{n-1}T^{n-1}+\dots+c_0I \). 
The right telescopes;
for instance $-C_{n-1}T^n$ from the first line combines with the 
$C_{n-1}T^n$ half of the second line. 
The total on the right is the zero matrix.
\end{proof}

We refer to that result by saying that a
matrix or map 
\definend{satisfies}\index{characteristic!polynomial!satisfied by} 
its characteristic polynomial.
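
To see the steps of that proof in a concrete case, here is a sketch with
a \( \nbyn{2} \) matrix chosen only for illustration.
\begin{equation*}
  T=\begin{mat}[r]
      1  &2  \\
      3  &4
    \end{mat}
  \qquad
  C=T-xI=\begin{mat}
      1-x  &2    \\
      3    &4-x
    \end{mat}
  \qquad
  c(x)=x^2-5x-2
\end{equation*}
Viewed as a polynomial with matrix coefficients, the adjoint is
\( \adj(C)=C_1x+C_0 \).
\begin{equation*}
  \adj(C)
  =\begin{mat}
      4-x  &-2   \\
      -3   &1-x
    \end{mat}
  =\begin{mat}[r]
      -1  &0   \\
       0  &-1
    \end{mat}x
  +\begin{mat}[r]
       4  &-2  \\
      -3  &1
    \end{mat}
\end{equation*}
The three coefficient equations hold:~\( 1\cdot I=-C_1 \),
and \( -5I=-C_0+C_1T \), and \( -2I=C_0T \).
Multiplying them from the right by $T^2$, by $T$, and by $I$, and then adding,
gives that \( c(T) \) is the zero matrix, as a direct computation confirms.
\begin{equation*}
  T^2-5T-2I
  =\begin{mat}[r]
      7   &10  \\
      15  &22
    \end{mat}
  -\begin{mat}[r]
      5   &10  \\
      15  &20
    \end{mat}
  -\begin{mat}[r]
      2  &0  \\
      0  &2
    \end{mat}
  =\begin{mat}[r]
      0  &0  \\
      0  &0
    \end{mat}
\end{equation*}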

\begin{lemma} \label{le:tSatisImpMinPolyDivides}
Where \( f(x) \) is a polynomial, if \( f(T) \) is the zero matrix 
then \( f(x) \) is divisible by the minimal polynomial of \( T \).
That is, any polynomial that is satisfied by \( T \) is divisible by
\( T \)'s minimal polynomial.
\end{lemma}

\begin{proof}
Let \( m(x) \) be minimal for \( T \).
The Division Theorem for Polynomials gives
\( f(x)=q(x)m(x)+r(x) \)
where the degree of \( r \) is strictly less than the degree of \( m \).
Because $T$ satisfies both $f$ and $m$, plugging $T$ into that equation gives
that \( r(T) \) is the zero matrix.
That contradicts the minimality of \( m \) unless \( r \)
is the zero polynomial.
\end{proof}
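
For instance (this illustration is ours), the matrix representing
rotation through $\pi/2$~radians satisfies \( R^2=-I \).
\begin{equation*}
  R=\begin{mat}[r]
      0  &-1  \\
      1  &0
    \end{mat}
  \qquad
  R^2=\begin{mat}[r]
      -1  &0   \\
       0  &-1
    \end{mat}
  \qquad
  R^4=\begin{mat}[r]
      1  &0  \\
      0  &1
    \end{mat}
\end{equation*}
No polynomial $x-c$ is satisfied by $R$ because $R$ is not a scalar multiple
of the identity, so the minimal polynomial is \( m(x)=x^2+1 \).
The polynomial \( f(x)=x^4-1 \) is also satisfied by $R$ and, as the lemma
requires, the division \( x^4-1=(x^2-1)(x^2+1) \) leaves a remainder of zero.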

Combining the prior two lemmas shows that the minimal polynomial 
divides the characteristic polynomial. 
Thus
any root of the minimal polynomial is also a root of the characteristic
polynomial. 
That is, so far we have that if 
\( m(x)=(x-\lambda_1)^{q_1}\cdots(x-\lambda_i)^{q_i} \) then
\( c(x) \) has the form
\( (x-\lambda_1)^{p_1}\cdots(x-\lambda_i)^{p_i}
     (x-\lambda_{i+1})^{p_{i+1}}\cdots(x-\lambda_z)^{p_z} \) where
each \( q_j \) is less than or equal to \( p_j \).
We finish the proof of the Cayley-Hamilton Theorem by showing that 
the characteristic polynomial has no additional roots, that is,
there are no $\lambda_{i+1}$, $\lambda_{i+2}$, etc.

\begin{lemma}
Each linear factor of the characteristic polynomial of a square matrix
is also a linear factor of the minimal polynomial.
\end{lemma}

\begin{proof}
Let \( T \) be a square matrix with minimal polynomial \( m(x) \) and 
assume that \( x-\lambda \) is a factor of the characteristic polynomial of 
\( T \), that is, that \( \lambda \) is an eigenvalue of \( T \).
We must show that $x-\lambda$ is a factor of $m$, i.e., that 
$m(\lambda)=0$.

Suppose that $\lambda$ is an eigenvalue of $T$ with associated 
eigenvector~$\vec{v}$.
Then
$T\cdot T\vec{v}=T\cdot\lambda\vec{v}=\lambda T\vec{v}=\lambda^2\vec{v}$.
Similarly, $T^n\vec{v}=\lambda^n\vec{v}$.
With that, we have that 
for any polynomial function \( p(x) \), application of the matrix \( p(T) \)
to \( \vec{v} \) equals the result of multiplying \( \vec{v} \) by the scalar
\( p(\lambda) \).
\begin{multline*}
  p(T)\cdot\vec{v}
  =(c_kT^k+\dots+c_1T+c_0I)\cdot\vec{v}
  =c_kT^k\vec{v}+\dots+c_1T\vec{v}+c_0\vec{v}  \\
  =c_k\lambda^k\vec{v}+\dots+c_1\lambda\vec{v}+c_0\vec{v}
  =p(\lambda)\cdot\vec{v}
\end{multline*}
Since \( m(T) \) is the zero matrix,
\( \zero=m(T)(\vec{v})=m(\lambda)\cdot\vec{v} \).
Because the eigenvector $\vec{v}$ is nonzero, the scalar must be zero, and
hence \( m(\lambda)=0 \).
\end{proof}

That concludes the proof of the Cayley-Hamilton Theorem.
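
The key step in the last proof, that \( p(T)\vec{v}=p(\lambda)\vec{v} \)
for an eigenvector~$\vec{v}$, is easy to see in a small case.
In this sketch the matrix, the eigenvector, and the polynomial
\( p(x)=x^2+1 \) are chosen only for illustration;
here $\vec{v}$ is associated with the eigenvalue $\lambda=3$.
\begin{equation*}
  T=\begin{mat}[r]
      1  &2  \\
      0  &3
    \end{mat}
  \quad
  \vec{v}=\begin{mat}[r]
      1  \\
      1
    \end{mat}
  \qquad
  p(T)\vec{v}
  =\begin{mat}[r]
      2  &8   \\
      0  &10
    \end{mat}
   \begin{mat}[r]
      1  \\
      1
    \end{mat}
  =\begin{mat}[r]
      10  \\
      10
    \end{mat}
  =p(3)\cdot\vec{v}
\end{equation*}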

\begin{example} \label{ex:MinPolyUsingCH}
We can use the Cayley-Hamilton Theorem to find the minimal polynomial of
this matrix.
\begin{equation*}
   T=
   \begin{mat}[r]
      2  &0  &0  &1  \\
      1  &2  &0  &2  \\
      0  &0  &2  &-1 \\
      0  &0  &0  &1
   \end{mat}
\end{equation*}
First we find its characteristic polynomial \( c(x)=(x-1)(x-2)^3 \)
with the usual determinant.
Now, the Cayley-Hamilton Theorem says that 
\( T \)'s minimal polynomial is either
\( (x-1)(x-2) \) or
\( (x-1)(x-2)^2 \) or
\( (x-1)(x-2)^3 \).
We can decide among the choices just by computing
\begin{equation*}
   (T-1I)(T-2I)=\!
   \begin{mat}[r]
      1  &0  &0  &1  \\
      1  &1  &0  &2  \\
      0  &0  &1  &-1 \\
      0  &0  &0  &0
   \end{mat}
   \begin{mat}[r]
      0  &0  &0  &1  \\
      1  &0  &0  &2  \\
      0  &0  &0  &-1 \\
      0  &0  &0  &-1
   \end{mat}
   =
   \begin{mat}[r]
      0  &0  &0  &0  \\
      1  &0  &0  &1  \\
      0  &0  &0  &0  \\
      0  &0  &0  &0
   \end{mat}
\end{equation*}
and
\begin{equation*}
   (T-1I)(T-2I)^2=
   \begin{mat}[r]
      0  &0  &0  &0  \\
      1  &0  &0  &1  \\
      0  &0  &0  &0  \\
      0  &0  &0  &0
   \end{mat}
   \begin{mat}[r]
      0  &0  &0  &1  \\
      1  &0  &0  &2  \\
      0  &0  &0  &-1 \\
      0  &0  &0  &-1
   \end{mat}
   =
   \begin{mat}[r]
      0  &0  &0  &0  \\
      0  &0  &0  &0  \\
      0  &0  &0  &0  \\
      0  &0  &0  &0
   \end{mat}
\end{equation*}
and so \( m(x)=(x-1)(x-2)^2 \).
\end{example}


\begin{exercises}
  \recommended \item 
    What are the possible minimal polynomials if a matrix has
    the given characteristic polynomial?
    \begin{exparts*}
      \partsitem $(x-3)^4$
      \partsitem $(x+1)^3(x-4)$
      \partsitem $(x-2)^2(x-5)^2$
      \partsitem  \( (x+3)^2(x-1)(x-2)^2 \)
    \end{exparts*}
    What is the degree of each possibility?
    \begin{answer} 
      The Cayley-Hamilton Theorem, \nearbytheorem{th:CayHam}, says that
      the minimal polynomial must contain the same linear factors
      as the characteristic polynomial, although possibly of lower degree
      but not of zero degree.
      \begin{exparts}
        \partsitem The possibilities are 
          $m_1(x)=x-3$, $m_2(x)=(x-3)^2$, $m_3(x)=(x-3)^3$,
          and $m_4(x)=(x-3)^4$.
          The first is a degree one polynomial, the second is degree two,
          the third is degree three, and the fourth is degree four.
        \partsitem The possibilities are $m_1(x)=(x+1)(x-4)$,
          $m_2(x)=(x+1)^2(x-4)$, and $m_3(x)=(x+1)^3(x-4)$.
          The first is a quadratic polynomial, that is, it has degree two.
          The second has degree three, and the third has degree four.
        \partsitem We have $m_1(x)=(x-2)(x-5)$, $m_2(x)=(x-2)^2(x-5)$,
          $m_3(x)=(x-2)(x-5)^2$, and $m_4(x)=(x-2)^2(x-5)^2$.
          They are polynomials of degree two, three, three, and four.
        \partsitem The possibilities are \( m_1(x)=(x+3)(x-1)(x-2) \),
          \( m_2(x)=(x+3)^2(x-1)(x-2) \),
          \( m_3(x)=(x+3)(x-1)(x-2)^2 \),
          and \( m_4(x)=(x+3)^2(x-1)(x-2)^2 \).
          The degree of $m_1$ is three, the degree of $m_2$ is four,
          the degree of $m_3$ is four, and the degree of $m_4$ is five.
      \end{exparts}
    \end{answer}
  \recommended \item 
    Find the minimal polynomial of each matrix.
    \begin{exparts*}
       \partsitem \( \begin{mat}[r]
                   3  &0  &0  \\
                   1  &3  &0  \\
                   0  &0  &4
                \end{mat} \)
       \partsitem \( \begin{mat}[r]
                   3  &0  &0  \\
                   1  &3  &0  \\
                   0  &0  &3
                \end{mat} \)
       \partsitem \( \begin{mat}[r]
                   3  &0  &0  \\
                   1  &3  &0  \\
                   0  &1  &3
                \end{mat} \)
       \partsitem \( \begin{mat}[r]
                   2  &0  &1  \\
                   0  &6  &2  \\
                   0  &0  &2
                \end{mat} \)
       \partsitem \( \begin{mat}[r]
                   2  &2  &1  \\
                   0  &6  &2  \\
                   0  &0  &2
                \end{mat} \)
       \partsitem \( \begin{mat}[r]
                   -1 &4  &0  &0  &0  \\
                    0 &3  &0  &0  &0  \\
                    0 &-4 &-1 &0  &0  \\
                    3 &-9 &-4 &2  &-1 \\
                    1 &5  &4  &1  &4
                \end{mat} \)
    \end{exparts*}
    \begin{answer}
      In each case we will use the method of \nearbyexample{ex:MinPolyUsingCH}.
      \begin{exparts}
       \partsitem Because $T$ is triangular, $T-xI$ is also triangular
         \begin{equation*}
           T-xI=
           \begin{mat}
             3-x  &0    &0   \\
             1    &3-x  &0   \\
             0    &0    &4-x
           \end{mat}
         \end{equation*}
         and so the characteristic polynomial is easy to compute:
         $c(x)=\deter{T-xI}=(3-x)^2(4-x)=-1\cdot (x-3)^2(x-4)$.
         There are only two possibilities for the minimal polynomial,
         $m_1(x)=(x-3)(x-4)$ and $m_2(x)=(x-3)^2(x-4)$.
         (Note that the characteristic polynomial has a negative sign
         but the minimal polynomial does not since it must
         have a leading coefficient of one).
         Because $m_1(T)$ is not the zero matrix
         \begin{equation*}
           (T-3I)(T-4I)
           =
           \begin{mat}[r]
             0  &0  &0  \\
             1  &0  &0  \\
             0  &0  &1
           \end{mat}
           \begin{mat}[r]
             -1  &0  &0  \\
              1  &-1 &0  \\
              0  &0  &0
           \end{mat}
           =
           \begin{mat}[r]
             0  &0  &0  \\
            -1  &0  &0  \\
             0  &0  &0
           \end{mat}
         \end{equation*}
         the minimal polynomial is $m(x)=m_2(x)$.
         \begin{multline*}
           (T-3I)^2(T-4I)
           =(T-3I)\cdot\bigl((T-3I)(T-4I)\bigr)              \\
           =
           \begin{mat}
             0  &0  &0  \\
             1  &0  &0  \\
             0  &0  &1
           \end{mat}
           \begin{mat}[r]
              0  &0  &0  \\
             -1  &0  &0  \\
              0  &0  &0
           \end{mat}
           =
           \begin{mat}[r]
             0  &0  &0  \\
             0  &0  &0  \\
             0  &0  &0
           \end{mat}
         \end{multline*}
       \partsitem As in the prior item, the fact that the matrix is 
        triangular makes computation of the characteristic polynomial
        easy.
        \begin{equation*}
          c(x)=\deter{T-xI}
              =
              \begin{vmat}
                3-x  &0   &0   \\
                1    &3-x &0   \\
                0    &0   &3-x
              \end{vmat}
              =(3-x)^3=-1\cdot (x-3)^3
        \end{equation*}
        There are three possibilities for the minimal polynomial
        $m_1(x)=(x-3)$, $m_2(x)=(x-3)^2$, and $m_3(x)=(x-3)^3$.
        We settle the question by computing $m_1(T)$
        \begin{equation*}
          T-3I=
          \begin{mat}[r]
            0  &0  &0  \\
            1  &0  &0  \\
            0  &0  &0
          \end{mat}
        \end{equation*}
        and $m_2(T)$.
        \begin{equation*}
          (T-3I)^2=
          \begin{mat}[r]
            0  &0  &0  \\
            1  &0  &0  \\
            0  &0  &0
          \end{mat}
          \begin{mat}[r]
            0  &0  &0  \\
            1  &0  &0  \\
            0  &0  &0
          \end{mat}
          =          
          \begin{mat}[r]
            0  &0  &0  \\
            0  &0  &0  \\
            0  &0  &0
          \end{mat}
        \end{equation*}
        Because $m_1(T)$ is not the zero matrix but $m_2(T)$ is, 
        $m_2(x)$ is the minimal polynomial.
       \partsitem Again, the matrix is triangular.
        \begin{equation*}
          c(x)=\deter{T-xI}
              =
              \begin{vmat}
                3-x  &0   &0   \\
                1    &3-x &0   \\
                0    &1   &3-x
              \end{vmat}
              =(3-x)^3=-1\cdot (x-3)^3
        \end{equation*}
        Again, there are three possibilities for the minimal polynomial
        $m_1(x)=(x-3)$, $m_2(x)=(x-3)^2$, and $m_3(x)=(x-3)^3$.
        We compute $m_1(T)$
        \begin{equation*}
          T-3I=
          \begin{mat}[r]
            0  &0  &0  \\
            1  &0  &0  \\
            0  &1  &0
          \end{mat}
        \end{equation*}
        and $m_2(T)$
        \begin{equation*}
          (T-3I)^2=
          \begin{mat}[r]
            0  &0  &0  \\
            1  &0  &0  \\
            0  &1  &0
          \end{mat}
          \begin{mat}[r]
            0  &0  &0  \\
            1  &0  &0  \\
            0  &1  &0
          \end{mat}
          =          
          \begin{mat}[r]
            0  &0  &0  \\
            0  &0  &0  \\
            1  &0  &0
          \end{mat}
        \end{equation*}
        and $m_3(T)$.
        \begin{equation*}
          (T-3I)^3
          =(T-3I)^2(T-3I)
          =
          \begin{mat}[r]
            0  &0  &0  \\
            0  &0  &0  \\
            1  &0  &0
          \end{mat}
          \begin{mat}[r]
            0  &0  &0  \\
            1  &0  &0  \\
            0  &1  &0
          \end{mat}
          =          
          \begin{mat}[r]
            0  &0  &0  \\
            0  &0  &0  \\
            0  &0  &0
          \end{mat}
        \end{equation*}
        Therefore, the minimal polynomial is $m(x)=m_3(x)=(x-3)^3$.
       \partsitem This case is also triangular, here upper triangular.
         \begin{equation*}
           c(x)=\deter{T-xI}=
           \begin{vmat}
             2-x  &0   &1     \\
             0    &6-x &2     \\
             0    &0   &2-x
           \end{vmat}
           =(2-x)^2(6-x)=-(x-2)^2(x-6)
         \end{equation*}
         There are two possibilities for the minimal polynomial,
         $m_1(x)=(x-2)(x-6)$ and $m_2(x)=(x-2)^2(x-6)$.
         Computation shows that the minimal polynomial isn't $m_1(x)$.
         \begin{equation*}
           (T-2I)(T-6I)=
           \begin{mat}[r]
             0  &0  &1  \\
             0  &4  &2  \\
             0  &0  &0  
           \end{mat}
           \begin{mat}[r]
             -4  &0  &1  \\
              0  &0  &2  \\
              0  &0  &-4
           \end{mat}
           =
           \begin{mat}[r]
             0  &0  &-4  \\
             0  &0  &0   \\
             0  &0  &0
           \end{mat}
         \end{equation*}
         It therefore must be that $m(x)=m_2(x)=(x-2)^2(x-6)$. 
         Here is a verification.
         \begin{multline*}
           (T-2I)^2(T-6I)=(T-2I)\cdot\bigl((T-2I)(T-6I)\bigr)             \\
           =
           \begin{mat}[r]
             0  &0  &1  \\
             0  &4  &2  \\
             0  &0  &0  
           \end{mat}
           \begin{mat}[r]
              0  &0  &-4   \\
              0  &0  &0   \\
              0  &0  &0
           \end{mat}
           =
           \begin{mat}[r]
             0  &0  &0  \\
             0  &0  &0   \\
             0  &0  &0
           \end{mat}
         \end{multline*}
       \partsitem The characteristic polynomial is 
         \begin{equation*}
           c(x)=\deter{T-xI}=
           \begin{vmat}
             2-x  &2   &1     \\
             0    &6-x &2     \\
             0    &0   &2-x
           \end{vmat}
           =(2-x)^2(6-x)=-(x-2)^2(x-6)
         \end{equation*}
         and there are two possibilities for the minimal polynomial,
         $m_1(x)=(x-2)(x-6)$ and $m_2(x)=(x-2)^2(x-6)$.
         Checking the first one
         \begin{equation*}
           (T-2I)(T-6I)=
           \begin{mat}[r]
             0  &2  &1  \\
             0  &4  &2  \\
             0  &0  &0  
           \end{mat}
           \begin{mat}[r]
             -4  &2  &1  \\
              0  &0  &2  \\
              0  &0  &-4
           \end{mat}
           =
           \begin{mat}[r]
             0  &0  &0  \\
             0  &0  &0   \\
             0  &0  &0
           \end{mat}
         \end{equation*}
         shows that the minimal polynomial is
         $m(x)=m_1(x)=(x-2)(x-6)$.
       \partsitem The characteristic polynomial is this.
         \begin{equation*}
           c(x)=\deter{T-xI}=
           \begin{vmat} 
              -1-x &4    &0    &0    &0    \\
               0   &3-x  &0    &0    &0    \\
               0   &-4   &-1-x &0    &0    \\
               3   &-9   &-4   &2-x  &-1   \\
               1   &5    &4    &1    &4-x
           \end{vmat}     
           =-(x-3)^3(x+1)^2
         \end{equation*}
         Here are the possibilities for the minimal polynomial,
         listed by ascending degree:
         $m_1(x)=(x-3)(x+1)$, $m_2(x)=(x-3)^2(x+1)$, $m_3(x)=(x-3)(x+1)^2$, 
         $m_4(x)=(x-3)^3(x+1)$, $m_5(x)=(x-3)^2(x+1)^2$, 
         and $m_6(x)=(x-3)^3(x+1)^2$. 
         The first one doesn't pan out
         \begin{align*}
           (T-3I)(T+1I)
           &=
           \begin{mat}[r] 
              -4   &4    &0    &0    &0    \\
               0   &0    &0    &0    &0    \\
               0   &-4   &-4   &0    &0    \\
               3   &-9   &-4   &-1   &-1   \\
               1   &5    &4    &1    &1  
           \end{mat}     
           \begin{mat}[r] 
               0   &4    &0    &0    &0    \\
               0   &4    &0    &0    &0    \\
               0   &-4   &0    &0    &0    \\
               3   &-9   &-4   &3    &-1   \\
               1   &5    &4    &1    &5  
           \end{mat}                           \\     
           &=
           \begin{mat}[r] 
               0   &0    &0    &0    &0    \\
               0   &0    &0    &0    &0    \\
               0   &0    &0    &0    &0    \\
              -4   &-4   &0    &-4   &-4   \\
               4   &4    &0    &4    &4  
           \end{mat}           
         \end{align*}
         but the second one does.
         \begin{multline*}
           (T-3I)^2(T+1I)=(T-3I)\bigl((T-3I)(T+1I)\bigr) \\
           \begin{aligned}
           &=
           \begin{mat}[r] 
              -4   &4    &0    &0    &0    \\
               0   &0    &0    &0    &0    \\
               0   &-4   &-4   &0    &0    \\
               3   &-9   &-4   &-1   &-1   \\
               1   &5    &4    &1    &1  
           \end{mat}     
           \begin{mat}[r] 
               0   &0    &0    &0    &0    \\
               0   &0    &0    &0    &0    \\
               0   &0    &0    &0    &0    \\
              -4   &-4   &0    &-4   &-4   \\
               4   &4    &0    &4    &4  
           \end{mat}                          \\          
           &=
           \begin{mat}[r] 
               0   &0    &0    &0    &0    \\
               0   &0    &0    &0    &0    \\
               0   &0    &0    &0    &0    \\
               0   &0    &0    &0    &0    \\
               0   &0    &0    &0    &0  
           \end{mat}  
           \end{aligned}         
         \end{multline*}
         The minimal polynomial is \( m(x)=(x-3)^2(x+1) \).
      \end{exparts} 
    \end{answer}
   \item 
     Find the minimal polynomial of this matrix.
     \begin{equation*}
        \begin{mat}[r]
           0  &1  &0  \\
           0  &0  &1  \\
           1  &0  &0
        \end{mat}
     \end{equation*}
     \begin{answer}
       Its characteristic polynomial has complex roots.
       \begin{equation*}
          \begin{vmat}
                   -x  &1  &0  \\
                    0  &-x &1  \\
                    1  &0  &-x
          \end{vmat}
          =(1-x)\cdot (x-(-\frac{1}{2}+\frac{\sqrt{3}}{2}i))
                \cdot (x-(-\frac{1}{2}-\frac{\sqrt{3}}{2}i))
       \end{equation*}
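       As the roots are distinct, each linear factor appears exactly once
       in the minimal polynomial.
       So the minimal polynomial is the characteristic polynomial 
       normalized to have leading coefficient~$1$, that is, $m(x)=x^3-1$.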
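       As a direct check (a sketch, writing $T$ for the matrix in the 
       question, a label introduced here), the cube of $T$ is the identity
       and so $T$ satisfies $x^3-1$.
       \begin{equation*}
         T^2=
         \begin{mat}[r]
           0  &0  &1  \\
           1  &0  &0  \\
           0  &1  &0
         \end{mat}
         \qquad
         T^3=T\cdot T^2=
         \begin{mat}[r]
           1  &0  &0  \\
           0  &1  &0  \\
           0  &0  &1
         \end{mat}
       \end{equation*}
       No polynomial of lower degree will do because $I$, $T$, and $T^2$ have 
       their nonzero entries in different positions, so no nontrivial linear
       combination of them is the zero matrix.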
     \end{answer}
  \recommended \item 
     What is the minimal polynomial of the differentiation
     operator $d/dx$ on \( \polyspace_n \)?
     \begin{answer}
       We know that $\polyspace_n$ is a dimension $n+1$ space and that
       the differentiation operator is
       nilpotent of index~$n+1$ (for instance, taking $n=3$, 
       $\polyspace_3=\set{c_3x^3+c_2x^2+c_1x+c_0\suchthat c_3,\ldots,c_0\in\C}$
       and the fourth derivative of a cubic is the zero polynomial).  
       Represent this operator using the canonical 
       form for nilpotent transformations.
       \begin{equation*}
         \begin{mat}
           0  &0  &0  &\ldots &  &0  \\
           1  &0  &0  &       &  &0  \\
           0  &1  &0  &       &  &   \\
              &   &\ddots            \\
           0  &0  &0  &       &1 &0 
         \end{mat}
       \end{equation*}
       This is an $\nbyn{(n+1)}$ matrix with an easy 
       characteristic polynomial,
       $c(x)=x^{n+1}$.
       (\textit{Remark:} this matrix is $\rep{d/dx}{B,B}$ where
        $B=\sequence{x^n,nx^{n-1},n(n-1)x^{n-2},\ldots,n!}$.)
       To find the minimal polynomial as in \nearbyexample{ex:MinPolyUsingCH}
       we consider the powers of $T-0I=T$.
       But, of course, the first power of $T$ that is the zero matrix is 
       the power $n+1$.
       So the minimal polynomial is also \( x^{n+1} \).
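       For illustration, take the small case $n=2$ (our choice), writing $N$
       for the $\nbyn{3}$ canonical form matrix.
       \begin{equation*}
         N=
         \begin{mat}[r]
           0  &0  &0  \\
           1  &0  &0  \\
           0  &1  &0
         \end{mat}
         \qquad
         N^2=
         \begin{mat}[r]
           0  &0  &0  \\
           0  &0  &0  \\
           1  &0  &0
         \end{mat}
         \qquad
         N^3=
         \begin{mat}[r]
           0  &0  &0  \\
           0  &0  &0  \\
           0  &0  &0
         \end{mat}
       \end{equation*}
       The square is not the zero matrix but the cube is, matching the 
       minimal polynomial $x^3$ for $\polyspace_2$.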
     \end{answer}
  \recommended \item 
    Find the minimal polynomial of matrices of this form
    \begin{equation*}
      \begin{mat}
        \lambda  &0        &0          &\ldots  &        &0  \\
        1        &\lambda  &0          &        &        &0  \\
        0        &1        &\lambda                          \\
                 &         &           &\ddots                \\
                 &         &           &        &\lambda &0   \\
        0        &0        &\ldots     &        &1       &\lambda
      \end{mat}
    \end{equation*}
    where the scalar $\lambda$ is fixed (i.e., is not a variable).
    \begin{answer}
      Call the matrix $T$ and suppose that it is \( \nbyn{n} \).
      Because $T$ is triangular, and so $T-xI$ is triangular,
      the characteristic polynomial is $c(x)=(x-\lambda)^n$.
      To see that the minimal polynomial is the same, consider
      $T-\lambda I$.
      \begin{equation*}
        \begin{mat}
          0        &0        &0          &\ldots  &0  \\
          1        &0        &0          &\ldots  &0  \\
          0        &1        &0                       \\
                   &         &\ddots                  \\
          0        &0        &\ldots     &1       &0      
        \end{mat}
      \end{equation*}
      Recognize it as the canonical form for a transformation that is 
      nilpotent of index~$n$; the power $(T-\lambda I)^j$ is zero first
      when $j$ is $n$.
      So the minimal polynomial is $m(x)=(x-\lambda)^n$.
    \end{answer}
  \item 
    What is the minimal polynomial of the transformation of
    \( \polyspace_n \) that sends \( p(x) \) to \( p(x+1) \)?
    \begin{answer}
      The $n=3$ case provides a hint.
      A natural basis for $\polyspace_3$ is  
      $B=\sequence{1,x,x^2,x^3}$.
      The action of the transformation is
      \begin{equation*}
        1\mapsto 1
        \quad
        x\mapsto x+1
        \quad
        x^2\mapsto x^2+2x+1
        \quad
        x^3\mapsto x^3+3x^2+3x+1
      \end{equation*}
      and so the representation $\rep{t}{B,B}$ is this upper triangular matrix.
      \begin{equation*}
        \begin{mat}[r]
          1  &1  &1  &1  \\
          0  &1  &2  &3  \\
          0  &0  &1  &3  \\
          0  &0  &0  &1
        \end{mat}
      \end{equation*}
      Because it is triangular, the fact that the characteristic polynomial is
      $c(x)=(x-1)^4$ is clear.
      For the minimal polynomial, the candidates are $m_1(x)=(x-1)$,
      \begin{equation*}
        T-1I=
        \begin{mat}[r]
          0  &1  &1  &1  \\
          0  &0  &2  &3  \\
          0  &0  &0  &3  \\
          0  &0  &0  &0
        \end{mat}
      \end{equation*}
      $m_2(x)=(x-1)^2$, 
      \begin{equation*}
        (T-1I)^2=
        \begin{mat}[r]
          0  &0  &2  &6  \\
          0  &0  &0  &6  \\
          0  &0  &0  &0  \\
          0  &0  &0  &0
        \end{mat}
      \end{equation*}
      $m_3(x)=(x-1)^3$,
      \begin{equation*}
        (T-1I)^3=
        \begin{mat}[r]
          0  &0  &0  &6  \\
          0  &0  &0  &0  \\
          0  &0  &0  &0  \\
          0  &0  &0  &0
        \end{mat}
      \end{equation*}
      and $m_4(x)=(x-1)^4$.
      Because $m_1$, $m_2$, and $m_3$ are not right, $m_4$ must be right,
      as is easily verified.
      
      In the case of a general $n$, the representation is an upper
      triangular matrix with ones on the diagonal.
      Thus the characteristic polynomial is $c(x)=(x-1)^{n+1}$.
      One way to verify that the minimal polynomial equals the 
      characteristic polynomial is to argue something like this:
      say that an upper triangular matrix is $0$-upper triangular if
      there are nonzero entries on the diagonal, that it is $1$-upper 
      triangular if the diagonal contains only zeroes and there are nonzero
      entries just above the diagonal, etc.
      As the above example illustrates, an induction argument will 
      show that, where $T-I$ has only nonnegative entries and the entries
      just above its diagonal are all positive, 
      the power $(T-I)^j$ is $j$-upper triangular.
      In particular $(T-I)^n$ is not the zero matrix, so no power of
      $x-1$ lower than $n+1$ will do.
      % We leave that argument to the reader.
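      As a small illustration of that pattern (taking $n=2$, our choice of 
      case, with the basis $\sequence{1,x,x^2}$), the powers of 
      $T-1I$ are these.
      \begin{equation*}
        T-1I=
        \begin{mat}[r]
          0  &1  &1  \\
          0  &0  &2  \\
          0  &0  &0
        \end{mat}
        \qquad
        (T-1I)^2=
        \begin{mat}[r]
          0  &0  &2  \\
          0  &0  &0  \\
          0  &0  &0
        \end{mat}
        \qquad
        (T-1I)^3=
        \begin{mat}[r]
          0  &0  &0  \\
          0  &0  &0  \\
          0  &0  &0
        \end{mat}
      \end{equation*}
      The square is not the zero matrix while the cube is, so in this case
      the minimal polynomial is $(x-1)^3$.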
    \end{answer}
   \item 
     What is the minimal polynomial of
     the map \( \map{\pi}{\C^3}{\C^3} \)
     projecting onto the first two coordinates?
      \begin{answer}
        The map twice is the same as the map once:~$\composed{\pi}{\pi}=\pi$,
        that is, $\pi^2=\pi$ and so the minimal polynomial is of degree
        at most two since \( m(x)=x^2-x \) will do.
        The fact that no linear polynomial will do follows from applying
        the maps on the left and right side of 
        $c_1\cdot \pi+c_0\cdot \identity=z$ (where $z$ is the zero map)
        to these two vectors.
        \begin{equation*}
          \colvec[r]{0 \\ 0 \\ 1}
          \qquad
          \colvec[r]{1 \\ 0 \\ 0}
        \end{equation*}
        Thus the minimal polynomial is $m$.
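        In more detail, here is a sketch of that computation.
        Applying $c_1\cdot \pi+c_0\cdot \identity$ to the two vectors gives 
        these.
        \begin{equation*}
          \colvec[r]{0 \\ 0 \\ 1}\mapsto\colvec{0 \\ 0 \\ c_0}
          \qquad
          \colvec[r]{1 \\ 0 \\ 0}\mapsto\colvec{c_1+c_0 \\ 0 \\ 0}
        \end{equation*}
        For both outputs to be the zero vector we need $c_0=0$ and then
        $c_1=0$, so no nonzero polynomial of degree at most one sends $\pi$ 
        to the zero map.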
      \end{answer}
   \item 
     Find a \( \nbyn{3} \) matrix whose minimal
     polynomial is \( x^2 \).
     \begin{answer}
        This is one answer.
        \begin{equation*}
            \begin{mat}[r]
              0  &0  &0  \\
              1  &0  &0  \\
              0  &0  &0
            \end{mat}
        \end{equation*} 
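        To check it, call that matrix $N$ (a label introduced here).
        It is not the zero matrix, so $m(x)=x$ will not do, but its square 
        is the zero matrix.
        \begin{equation*}
            N^2=
            \begin{mat}[r]
              0  &0  &0  \\
              1  &0  &0  \\
              0  &0  &0
            \end{mat}
            \begin{mat}[r]
              0  &0  &0  \\
              1  &0  &0  \\
              0  &0  &0
            \end{mat}
            =
            \begin{mat}[r]
              0  &0  &0  \\
              0  &0  &0  \\
              0  &0  &0
            \end{mat}
        \end{equation*} 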
      \end{answer}
  \item 
     What is wrong with this claimed proof of
     \nearbylemma{le:MatSatItsCharPoly}:
      ``if \( c(x)=\deter{T-xI} \) then \( c(T)=\deter{T-TI}=0 \)''?
     \cite{Cullen}
     \begin{answer}
       The \( x \) must be a scalar, not a matrix.
     \end{answer}
  \item 
    Verify \nearbylemma{le:MatSatItsCharPoly} for \( \nbyn{2} \)
    matrices by direct calculation.
    \begin{answer}
      The characteristic polynomial of
      \begin{equation*}
         T=\begin{mat}
              a  &b  \\
              c  &d
           \end{mat}
      \end{equation*}
      is \( (a-x)(d-x)-bc=x^2-(a+d)x+(ad-bc) \).
      Substitute
      \begin{multline*}
         \begin{mat}
              a  &b  \\
              c  &d
         \end{mat}^2
         -
         (a+d)\begin{mat}
            a  &b  \\
            c  &d
         \end{mat}
         +
         (ad-bc)\begin{mat}[r]
            1  &0  \\
            0  &1
         \end{mat}                            \\
         =                     
         \begin{mat}
            a^2+bc  &ab+bd  \\
            ac+cd   &bc+d^2
         \end{mat}
         -
         \begin{mat}
            a^2+ad  &ab+bd   \\
            ac+cd   &ad+d^2
         \end{mat}
         +
         \begin{mat}
            ad-bc  &0      \\
            0      &ad-bc
         \end{mat}
      \end{multline*}
      and just check each entry sum to see that the result is the zero matrix.
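      Written out, the four entry sums are these, with the upper row of 
      the matrix on the first line and the lower row on the second.
      \begin{align*}
        (a^2+bc)-(a^2+ad)+(ad-bc) &=0  
           &(ab+bd)-(ab+bd)+0 &=0        \\
        (ac+cd)-(ac+cd)+0 &=0          
           &(bc+d^2)-(ad+d^2)+(ad-bc) &=0
      \end{align*}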
    \end{answer}
  \recommended \item
    Prove that the minimal polynomial of an \( \nbyn{n} \) matrix has
    degree at most \( n \) (not \( n^2 \) as a person might guess from this
    subsection's opening).
    Verify that this maximum, \( n \), can happen.
    \begin{answer}
      By the Cayley-Hamilton theorem the degree of the minimal polynomial is
      less than or equal to the degree of the characteristic polynomial,
      \( n \).
      \nearbyexample{ex:MinPolyForRotMat} shows that \( n \) can happen.
    \end{answer}
   \recommended \item 
     Show that, on a nontrivial vector space, a linear transformation is 
     nilpotent if and only if its only eigenvalue is zero.
     \begin{answer}
       Let the linear transformation be $\map{t}{V}{V}$.
       If $t$ is nilpotent then there is an $n$ such that $t^n$ is the zero map,
       so $t$ satisfies the polynomial $p(x)=x^n=(x-0)^n$.
       By \nearbylemma{le:tSatisImpMinPolyDivides} the minimal polynomial of 
       $t$ divides $p$, so the minimal polynomial has only zero for a root.
       By Cayley-Hamilton, \nearbytheorem{th:CayHam},
       the characteristic polynomial has only zero for a root.
       Thus the only eigenvalue of $t$ is zero.

       Conversely, if a transformation \( t \) on an
       \( n \)-dimensional space has only the single eigenvalue of zero 
       then its characteristic polynomial is \( x^n \). 
       The Cayley-Hamilton Theorem says that a map satisfies its
       characteristic polynomial so \( t^n \) is the zero map.
       Thus $t$ is nilpotent.
     \end{answer}
   \item 
       What is the minimal polynomial of a zero map or matrix?
       Of an identity map or matrix?
       \begin{answer}
         A minimal polynomial must have leading coefficient $1$, 
         and so if the minimal polynomial of a map or matrix were to 
         be a degree zero polynomial then it would be $m(x)=1$.
         But the identity map or matrix equals the zero map or matrix
         only on a trivial vector space.

         So in the nontrivial case the minimal polynomial must be of degree
         at least one.
         A zero map or matrix has minimal polynomial \( m(x)=x \), and an
         identity map or matrix has minimal polynomial \( m(x)=x-1 \). 
       \end{answer}
  \recommended \item 
     Interpret the minimal polynomial of 
     \nearbyexample{ex:PolySendRotMatToZ} geometrically.
     \begin{answer}
       We can interpret the polynomial geometrically as, ``a \( \degs{60} \)
       rotation minus two rotations of \( \degs{30} \) equals the
       identity.''
     \end{answer}
   \item 
     What is the minimal polynomial of a diagonal matrix?
     \begin{answer}
       For a diagonal matrix
       \begin{equation*}
          T=
          \begin{mat}
             t_{1,1}   &0        \\
             0         &t_{2,2}  \\
                       &        &\ddots  \\
                       &        &      &t_{n,n}
          \end{mat}
       \end{equation*}
       the characteristic polynomial is 
       $(t_{1,1}-x)(t_{2,2}-x)\cdots (t_{n,n}-x)$.     
       Of course, some of those factors may be repeated, e.g., the matrix might
       have $t_{1,1}=t_{2,2}$.
       For instance, the characteristic polynomial of
       \begin{equation*}
          D=
          \begin{mat}[r]
             3 &0 &0  \\
             0 &3 &0  \\
             0 &0 &1
          \end{mat}
       \end{equation*}
       is \( (3-x)^2(1-x)=-1\cdot (x-3)^2(x-1) \). 

       To form the minimal polynomial, 
       take the linear factors \( x-t_{i,i} \), throw out repeats, 
       and multiply them together.
       For instance, the minimal polynomial of $D$
       is \( (x-3)(x-1) \).
       To check this, note first that \nearbytheorem{th:CayHam}, 
       the Cayley-Hamilton theorem, requires that each linear factor in the
       characteristic polynomial appears at least once in the minimal
       polynomial.
       One way to check the other direction\Dash that in the case of
       a diagonal matrix, 
       each linear factor need appear at most once\Dash is to
       use a matrix argument.
       A diagonal matrix, multiplying from the left, rescales rows by
       the entry on the diagonal.
       But in a product $(T-t_{1,1}I)\cdots\hbox{}$, even without any repeat
       factors, every row is zero in at least one of the factors. 

       For instance, in the product 
       \begin{equation*}
         (D-3I)(D-1I)=(D-3I)(D-1I)I=
         \begin{mat}[r]
           0  &0  &0  \\
           0  &0  &0  \\
           0  &0  &-2        
         \end{mat}
         \begin{mat}[r]
           2  &0  &0  \\
           0  &2  &0  \\
           0  &0  &0
         \end{mat}
         \begin{mat}[r]
           1  &0  &0  \\
           0  &1  &0  \\
           0  &0  &1
         \end{mat}
       \end{equation*}
       because the first and second rows of the first matrix $D-3I$ are
       zero, the entire product will have a first row and second
       row that are zero.
       And because the third row of the middle matrix $D-1I$ is zero,
       the entire product has a third row of zero.
    \end{answer}
  \recommended \item 
    A \definend{projection}\index{projection}%
    \index{transformation!projection} 
    is any transformation \( t \) such that \( t^2=t \).
    (For instance, consider the transformation of the plane $\Re^2$ projecting
    each vector onto its first coordinate.
    If we project twice then we get the same result as if we project just once.)
    What is the minimal polynomial of a projection?
    \begin{answer}
      This subsection starts with the observation that the powers of 
      a linear transformation cannot climb forever without a ``repeat'',
      that is, that for some power~$n$ there is a linear relationship
      $c_n\cdot t^n+\dots+c_1\cdot t+c_0\cdot \identity=z$ where $z$ is the
      zero transformation.
      The definition of projection is that for such a map
      one linear relationship is quadratic, $t^2-t=z$.
      To finish, we need only consider whether this relationship might not
      be minimal, that is, are there projections for which the 
      minimal polynomial is constant or linear?

      For the minimal polynomial to be constant, the map would have to
      satisfy that $c_0\cdot\identity=z$, where $c_0=1$ since the leading
      coefficient of a minimal polynomial is $1$.
      This is only satisfied by the zero transformation on a trivial space.
      This is a projection, but not an interesting one.

      For the minimal polynomial of a transformation to be linear would give 
      $c_1\cdot t+c_0\cdot\identity=z$ where $c_1=1$.
      This equation gives $t=-c_0\cdot \identity$.
      Coupling it with the requirement that $t^2=t$ gives
      $t^2=(-c_0)^2\cdot\identity=-c_0\cdot\identity$, which gives that
      $c_0=0$ and $t$ is the zero transformation or that $c_0=-1$ and
      $t$ is the identity.       

      Thus, except in the cases where the projection is a zero map or an
      identity map, the minimal polynomial is $m(x)=x^2-x$. 
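      For instance, the projection of the plane onto its first coordinate 
      mentioned in the question is represented with respect to the standard 
      basis by the matrix below, which we call $T$ here.
      \begin{equation*}
        T=
        \begin{mat}[r]
          1  &0  \\
          0  &0
        \end{mat}
        \qquad
        T^2-T=
        \begin{mat}[r]
          1  &0  \\
          0  &0
        \end{mat}
        -
        \begin{mat}[r]
          1  &0  \\
          0  &0
        \end{mat}
        =
        \begin{mat}[r]
          0  &0  \\
          0  &0
        \end{mat}
      \end{equation*}
      Because $T$ is neither the zero matrix nor the identity matrix, its 
      minimal polynomial is $m(x)=x^2-x=x(x-1)$.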
    \end{answer}
  \item \label{exer:SomeRootsMinPolyAreEigs}
    \textit{The first two items of this question are review.}
    \begin{exparts}
      \partsitem Prove that the composition of one-to-one maps is
        one-to-one.
      \partsitem Prove that if a linear map is not one-to-one then
        at least one nonzero vector from the domain maps to the 
        zero vector in the codomain.
      \partsitem Verify the statement, excerpted here, that
         precedes \nearbytheorem{th:CayHam}.
         \begin{quotation}
           \noindent \ldots{} 
           if a minimal polynomial $m(x)$ for a transformation $t$ 
           factors as
           $m(x)=(x-\lambda_1)^{q_1}\cdots (x-\lambda_z)^{q_z}$
           then 
           \( m(t)=\composed{\composed{(t-\lambda_1)^{q_1}}{\cdots}}{
                                                        (t-\lambda_z)^{q_z}} \) 
          is the zero map. 
          Since \( m(t) \) sends every vector to zero, at least
          one of the maps \( t-\lambda_i \)  sends some
          nonzero vectors to zero.  \ldots{}
          That is, \ldots{} 
          at least some of the \( \lambda_i \) are eigenvalues.
        \end{quotation}
    \end{exparts}
    \begin{answer}
      \begin{exparts}
       \partsitem \textit{This is a property of functions in general,
          not just of linear functions.}
          Suppose that $f$ and $g$ are one-to-one functions such that
          $\composed{f}{g}$ is defined.
          Let $\composed{f}{g}(x_1)=\composed{f}{g}(x_2)$, so that
          $f(g(x_1))=f(g(x_2))$.
          Because $f$ is one-to-one this implies that $g(x_1)=g(x_2)$.
          Because $g$ is also one-to-one, this in turn implies that
          $x_1=x_2$.
          Thus, in summary, $\composed{f}{g}(x_1)=\composed{f}{g}(x_2)$
          implies that $x_1=x_2$ and so $\composed{f}{g}$ is one-to-one.
        \partsitem If the linear map $h$ 
          is not one-to-one then there are unequal
          vectors $\vec{v}_1$, $\vec{v}_2$ that map to the same value
          $h(\vec{v}_1)=h(\vec{v}_2)$.
          Because $h$ is linear, we have
          $\zero=h(\vec{v}_1)-h(\vec{v}_2)=h(\vec{v}_1-\vec{v}_2)$
          and so $\vec{v}_1-\vec{v}_2$ is a nonzero vector from the domain
          that $h$ maps to the zero vector of the codomain  
          ($\vec{v}_1-\vec{v}_2$
          does not equal the zero vector of the domain because $\vec{v}_1$
          does not equal $\vec{v}_2$).
        \partsitem The map
          $m(t)$ sends every vector in the domain to 
          zero and so it is not one-to-one (except in a trivial space, which 
          we ignore).
          By the first item of this question, 
          since the composition $m(t)$ is not one-to-one, 
          at least one of the components $t-\lambda_i$ is not one-to-one.
          By the second item, $t-\lambda_i$ has a nontrivial null space.
          Because $(t-\lambda_i)(\vec{v})=\zero$ holds if and only if
          $t(\vec{v})=\lambda_i\cdot\vec{v}$, the prior sentence gives that
          $\lambda_i$ is an eigenvalue (recall that the definition of
          eigenvalue requires that the relationship hold for at least one
          nonzero $\vec{v}$).
      \end{exparts}
    \end{answer}
  \item 
    True or false:~for a transformation on an
    \( n \) dimensional space, if the minimal polynomial has degree \( n \) 
    then the map is diagonalizable.
    \begin{answer}
      This is false.
      The natural example of a non-diagonalizable transformation works here.
      Consider the transformation of $\C^2$ represented with respect to
      the standard basis by this matrix.
      \begin{equation*}
        N=
        \begin{mat}[r]
          0  &1  \\
          0  &0
        \end{mat}
      \end{equation*}
      The characteristic polynomial is $c(x)=x^2$. 
      Thus the minimal polynomial is either $m_1(x)=x$ or $m_2(x)=x^2$.
      The first is not right since $N-0\cdot I$ is not the zero matrix.
      So in this example the minimal polynomial has degree equal to the 
      dimension of the underlying space and, as mentioned,
      we know this matrix is not diagonalizable because it is nilpotent
      but is not the zero matrix. 
    \end{answer}
   \item 
     Let $f(x)$ be a polynomial.
     Prove that if $A$ and $B$ are similar matrices then $f(A)$ is 
     similar to $f(B)$.
     \begin{exparts}
       \partsitem Now show that similar matrices have the same characteristic
         polynomial.
       \partsitem Show that similar matrices have the same minimal polynomial.
       \partsitem Decide if these are similar.
          \begin{equation*}
            \begin{mat}[r]
              1  &3  \\
              2  &3
            \end{mat}
            \qquad
            \begin{mat}[r]
              4  &-1 \\
              1  &1
            \end{mat}
          \end{equation*}
     \end{exparts}
     \begin{answer}
       Let \( A \) and \( B \) be similar, with \( A=PBP^{-1} \).
       From the facts that 
       \begin{multline*}
           A^n=(PBP^{-1})^n=(PBP^{-1})(PBP^{-1})\cdots(PBP^{-1})   \\
                           =PB(P^{-1}P)B(P^{-1}P)\cdots (P^{-1}P)BP^{-1}
                           =PB^nP^{-1}
       \end{multline*}
       and $c\cdot A=c\cdot(PBP^{-1})=P(c\cdot B)P^{-1}$ follows 
       the required fact that for any polynomial function $f$ we have 
       \( f(A)=P\,f(B)\,P^{-1} \).
       For instance, if $f(x)=x^2+2x+3$ then
       \begin{multline*}
         A^2+2A+3I=(PBP^{-1})^2+2\cdot PBP^{-1}+3\cdot I           \\
                  =(PBP^{-1})(PBP^{-1})+P(2B)P^{-1}+3\cdot PP^{-1}
                  =P(B^2+2B+3I)P^{-1}
       \end{multline*}
       shows that $f(A)$ is similar to $f(B)$.
       \begin{exparts}
         \partsitem Taking $f$ to be a linear polynomial we have that
           $A-xI$ is similar to $B-xI$.
           Similar matrices have equal determinants (since 
           $\deter{A}=\deter{PBP^{-1}}
               =\deter{P}\cdot\deter{B}\cdot\deter{P^{-1}}
               =1\cdot\deter{B}\cdot 1=\deter{B}$).
           Thus the characteristic polynomials are equal. 
         \partsitem 
           As \( P \) and \( P^{-1} \) are invertible, \( f(A) \) is the
           zero matrix when, and only when, \( f(B) \) is the zero matrix.
         \partsitem They cannot be similar since they don't have the same
           characteristic polynomial.
           The characteristic polynomial of the first one is 
           $x^2-4x-3$ while the characteristic polynomial of the 
           second is $x^2-5x+5$.
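           For reference, here are the two determinant computations, up to 
           the overall sign convention for the characteristic polynomial.
           \begin{align*}
             \begin{vmat}
               1-x  &3    \\
               2    &3-x
             \end{vmat}
             &=(1-x)(3-x)-6=x^2-4x-3    \\
             \begin{vmat}
               4-x  &-1   \\
               1    &1-x
             \end{vmat}
             &=(4-x)(1-x)+1=x^2-5x+5
           \end{align*}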
       \end{exparts}  
     \end{answer}
   \item 
    \begin{exparts}
     \partsitem Show that a matrix is invertible if and only if 
       the constant term
       in its minimal polynomial is not $0$.
     \partsitem Show that if a square matrix \( T \) is not invertible     
       then there is
       a nonzero matrix \( S \) such that \( ST \) and \( TS \) both equal the
       zero matrix.
     \end{exparts}
     \begin{answer}
      Suppose that \( m(x)=x^n+m_{n-1}x^{n-1}+\dots+m_1x+m_0 \)
      is minimal for \( T \).
      \begin{exparts}
       \partsitem 
         For the `if' argument, 
         because \( T^n+\dots+m_1T+m_0I \) is the zero matrix we
         have that \( I=(T^n+\dots+m_1T)/(-m_0)=
         T\cdot (T^{n-1}+\dots+m_1I)/(-m_0) \) and so 
         the matrix $(-1/m_0)\cdot (T^{n-1}+\dots+m_1I)$ is the inverse of $T$.
         For `only if', suppose that \( m_0=0 \) 
         (we put the \( n=1 \) case aside but it is easy) so that
         \( T^n+\dots+m_1T=(T^{n-1}+\dots+m_1I)T \) is the zero matrix.
         Note that \( T^{n-1}+\dots+m_1I \) is not the zero matrix because
         the degree of the minimal polynomial is \( n \).
         If \( T^{-1} \) exists then multiplying both
         \( (T^{n-1}+\dots+m_1I)T \) and the zero matrix from the right 
         by $T^{-1}$ gives a contradiction.
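         As an illustration of the `if' half (the matrix is our choice of
         example), this $T$ has minimal polynomial 
         $(x-1)^2=x^2-2x+1$, so $m_1=-2$ and $m_0=1$, and the formula gives 
         $(-1/m_0)\cdot(T+m_1I)=2I-T$.
         \begin{equation*}
           T=
           \begin{mat}[r]
             1  &1  \\
             0  &1
           \end{mat}
           \qquad
           2I-T=
           \begin{mat}[r]
             1  &-1  \\
             0  &1
           \end{mat}
           \qquad
           T\cdot(2I-T)=
           \begin{mat}[r]
             1  &0  \\
             0  &1
           \end{mat}
         \end{equation*}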
       \partsitem If \( T \) is not invertible then the constant term in its
         minimal polynomial is zero.
         Thus,
         \begin{equation*}
            T^n+\dots+m_1T=(T^{n-1}+\dots+m_1I)T=T(T^{n-1}+\dots+m_1I)
         \end{equation*}
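         is the zero matrix.
         So we can take $S=T^{n-1}+\dots+m_1I$.
         It is not the zero matrix because otherwise $T$ would satisfy a 
         polynomial whose degree, $n-1$, is less than the degree of the minimal 
         polynomial.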
       \end{exparts} 
     \end{answer}
   \recommended \item \label{exer:PolyMapsFactor}
      \begin{exparts}
        \partsitem Finish the proof of \nearbylemma{le:PolyMapsFactor}.
        \partsitem Give an example to show that the result does not hold
          if $t$ is not linear.
      \end{exparts}
      \begin{answer}
        \begin{exparts}
          \partsitem
            For the inductive step, assume that \nearbylemma{le:PolyMapsFactor}
            is true for polynomials
            of degree \( 1,\ldots,k-1 \) and consider a polynomial \( f(x) \)
            of degree \( k \). 
            Factor $f(x)=a(x-\lambda_1)^{q_1}\cdots(x-\lambda_z)^{q_z}$,
            where $a$ is the leading coefficient, 
            and let
            \( a(x-\lambda_1)^{q_1-1}\cdots(x-\lambda_z)^{q_z} \)
            be \( c_{k-1}x^{k-1}+\cdots+c_1x+c_0 \).
            Substitute:
            \begin{align*}
               a\cdot\composed{\composed{(t-\lambda_1)^{q_1}}{\cdots}}{
                                            (t-\lambda_z)^{q_z}}(\vec{v})
               &=
               \composed{(t-\lambda_1)}{
                  \bigl(a\cdot\composed{\composed{(t-\lambda_1)^{q_1-1}}{\cdots}}{
                  (t-\lambda_z)^{q_z}}\bigr) }
                          (\vec{v})    \\
               &=
               (t-\lambda_1)\,(c_{k-1}t^{k-1}(\vec{v})+\cdots+c_0\vec{v}) \\
               &=
               f(t)(\vec{v})
            \end{align*}
           (the first equality holds because the linear map 
           \( t-\lambda_1 \) respects scalar multiplication, 
           the second follows from the inductive hypothesis, and 
           the third from the linearity of \( t \)).
         \partsitem One example is to consider the squaring map
            $\map{s}{\Re}{\Re}$ given by $s(x)=x^2$.
            It is nonlinear.
            The action defined by the polynomial $f(t)=t^2-1$ 
            changes $s$ to $f(s)=s^2-1$, which is this map.
            \begin{equation*} 
              x\mapsunder{s^2-1} \composed{s}{s}(x)-1=x^4-1
            \end{equation*}
            Observe that this map differs from the map
            $\composed{(s-1)}{(s+1)}$; for instance, the first map takes
            $x=5$ to $624$ while the second one takes $x=5$ to $675$.
       \end{exparts} 
     \end{answer}
%  \item
%    Give an example of two \( \nbyn{4} \) matrices that have the 
%    same characteristic polynomial and the same minimal polynomial but
%    nonetheless are not similar.
%    \begin{answer}
%      These two have the same characteristic polynomial,
%      $c_S(x)=c_T(x)=(x-2)^4$.
%      \begin{equation*}
%        S=
%        \begin{mat}[r]
%          2  &0  &0  &0  \\
%          1  &2  &0  &0  \\
%          0  &0  &2  &0  \\
%          0  &0  &0  &2
%        \end{mat}
%        \qquad
%        T=
%        \begin{mat}[r]
%          2  &0  &0  &0  \\
%          1  &2  &0  &0  \\
%          0  &0  &2  &0  \\
%          0  &0  &1  &2
%        \end{mat}
%      \end{equation*}
%      For each of the two
%      the candidates for the minimal polynomial are 
%      $m_1(x)=(x-2)$, $m_2(x)=(x-2)^2$, $m_3(x)=(x-2)^3$, and 
%      $m_4(x)=(x-2)^4$.
%      We have 
%      \begin{equation*}
%        S-2I=
%        \begin{mat}[r]
%          0  &0  &0  &0  \\
%          1  &0  &0  &0  \\
%          0  &0  &0  &0  \\
%          0  &0  &0  &0
%        \end{mat}
%        \qquad
%        T-2I=
%        \begin{mat}[r]
%          0  &0  &0  &0  \\
%          1  &0  &0  &0  \\
%          0  &0  &0  &0  \\
%          0  &0  &1  &0
%        \end{mat}
%      \end{equation*}
%      and
%      \begin{equation*}
%        (S-2I)^2=
%        \begin{mat}[r]
%          0  &0  &0  &0  \\
%          0  &0  &0  &0  \\
%          0  &0  &0  &0  \\
%          0  &0  &0  &0
%        \end{mat}
%        =
%        (T-2I)^2
%      \end{equation*}
%      and so the minimal polynomial for each is $m_2$.
%    \end{answer}
  \item
    Any transformation or square matrix has a minimal polynomial.
    Does the converse hold?
    \begin{answer}
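      Yes, for any polynomial with leading coefficient~$1$ and degree at
      least one.
      Expand down the last column to check that
      \( x^n+m_{n-1}x^{n-1}+\dots+m_1x+m_0 \) is plus or minus the
      determinant of this matrix $T-xI$.
      \begin{equation*}
         \begin{mat}
            -x  &0  &0  &        &    &-m_0    \\
             1  &-x &0  &        &    &-m_1    \\
             0  &1  &-x &        &    &-m_2    \\
                &   &   &\ddots  &    &\vdots  \\
                &   &   &        &1   &-m_{n-1}-x
         \end{mat}
      \end{equation*} 
      So that polynomial is, up to sign, the characteristic polynomial 
      of~$T$.
      It is also the minimal polynomial of~$T$:
      the matrix $T$ sends 
      $\vec{e}_1\mapsto\vec{e}_2\mapsto\cdots\mapsto\vec{e}_n$, so the vectors
      $\vec{e}_1,T\vec{e}_1,\ldots,T^{n-1}\vec{e}_1$ are linearly independent
      and no nonzero polynomial of degree less than~$n$ sends $T$ to the
      zero matrix.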
     \end{answer}
\end{exercises}


\subsectionoptional{Jordan Canonical Form}
We are looking for a canonical form for matrix similarity.
This subsection completes this program by moving
from the canonical form for the classes 
of nilpotent matrices to
the canonical form for all classes.

\begin{lemma} \label{le:NilIffOnlyEigenZero}
A linear transformation on a nontrivial vector space 
is nilpotent if and only if its only eigenvalue is zero.
\end{lemma}

\begin{proof}
Let the linear transformation be $\map{t}{V}{V}$.
If $t$ is nilpotent then there is an $n$ such that $t^n$ is the zero map,
so $t$ satisfies the polynomial $p(x)=x^n=(x-0)^n$.
By \nearbylemma{le:tSatisImpMinPolyDivides} the minimal polynomial of 
$t$ divides $p$, so the minimal polynomial has only zero for a root.
By Cayley-Hamilton, \nearbytheorem{th:CayHam},
the characteristic polynomial has only zero for a root.
Thus the only eigenvalue of $t$ is zero.

Conversely, if a transformation \( t \) on an
\( n \)-dimensional space has only the single eigenvalue of zero 
then its characteristic polynomial is \( x^n \). 
\nearbylemma{le:MatSatItsCharPoly} says that a map satisfies its
characteristic polynomial so \( t^n \) is the zero map.
Thus $t$ is nilpotent.
\end{proof}

\noindent The `nontrivial vector space' is in the statement of 
that lemma because on a trivial
space $\set{\zero}$ the only transformation is the zero map, which has
no eigenvalues because there are no associated nonzero eigenvectors.

\begin{corollary} \label{cor:tMinLambdaNilpotent}
The transformation $t-\lambda$ is nilpotent if and only if 
$t$'s only eigenvalue is $\lambda$.   
\end{corollary}

\begin{proof}
The transformation \( t-\lambda \) is nilpotent if and only if
$t-\lambda$'s only eigenvalue is \( 0 \).
That holds if and only if $t$'s only eigenvalue is $\lambda$, because
\( t(\vec{v})=\lambda\vec{v} \) if and 
only if \( (t-\lambda)\,(\vec{v})=0\cdot\vec{v} \).
\end{proof}

We already have the canonical form that we want for 
the case of nilpotent matrices, 
that is, for each matrix whose only eigenvalue is zero.
Corollary~III.\ref{cor:NilpotentMatCanonForm} says that each 
such matrix is similar to one that is all
zeroes except for blocks of subdiagonal ones.
% (To make this representation unique we can fix some arrangement of
% the blocks, perhaps from longest to shortest.)

\begin{lemma}   \label{le:SimRespAddScalar}
If the matrices \( T-\lambda I \) and \( N \) are similar 
then \( T \) and \( N+\lambda I \) are also similar,
via the same change of basis matrices.
\end{lemma}

\begin{proof}
With \( N=P(T-\lambda I)P^{-1}=PTP^{-1}-P(\lambda I)P^{-1} \)
we have $N=PTP^{-1}-PP^{-1}(\lambda I)$
since the scalar matrix \( \lambda I \) commutes with any matrix, 
and so \( N=PTP^{-1}-\lambda I \).
Therefore \( N+\lambda I=PTP^{-1} \).
\end{proof}

\begin{example}   \label{ex:SingJordBlock}
The characteristic polynomial of
\begin{equation*}
  T=\begin{mat}[r]
      2  &-1  \\
      1  &4
    \end{mat}
\end{equation*}
is \( (x-3)^2 \) and so \( T \) has only the single eigenvalue \( 3 \).
Thus for 
\begin{equation*}
  T-3I=\begin{mat}[r]
     -1  &-1  \\
      1  &1
    \end{mat}
\end{equation*}
the only eigenvalue is \( 0 \) and \( T-3I \) is nilpotent.
Finding the null spaces is routine; to ease this computation we take 
$T$ to represent a transformation $\map{t}{\C^2}{\C^2}$ with respect to
the standard basis (we shall do this
for the rest of the chapter). 
\begin{equation*}
   \nullspace{t-3}=\set{\colvec{-y \\ y}\suchthat y\in\C}
   \qquad
   \nullspace{(t-3)^2}=\C^2
\end{equation*}
The dimension of each null space
shows that the action of the map $t-3$ on a string basis is
$\vec{\beta}_1\mapsto\vec{\beta}_2\mapsto\zero$.
Thus, here is the canonical form for $t-3$
with one choice for a string basis.
\begin{equation*}
  \rep{t-3}{B,B}
  =N
  =\begin{mat}[r]
      0  &0   \\
      1  &0
    \end{mat}
  \qquad
  B=\sequence{\colvec[r]{1 \\ 1},\colvec[r]{-2 \\ 2}}
\end{equation*}
By \nearbylemma{le:SimRespAddScalar}, \( T \) is similar to
this matrix.
\begin{equation*}
  \rep{t}{B,B}=
  N+3I=
  \begin{mat}[r]
     3  &0  \\
     1  &3
  \end{mat}
\end{equation*}
We can produce the similarity computation.
Recall how to find the change of
basis matrices $P$ and $P^{-1}$ to express \( N \) as \( P(T-3I)P^{-1} \).
The similarity diagram
\begin{equation*}
  \begin{CD}
    \C^2_{\wrt{\stdbasis_2}}      @>t-3>T-3I>      \C^2_{\wrt{\stdbasis_2}}     \\
    @V\scriptstyle\identity V\scriptstyle PV  
                                 @V\scriptstyle\identity V\scriptstyle PV \\
    \C^2_{\wrt{B}}                 @>t-3>N>         \C^2_{\wrt{B}}
  \end{CD}
\end{equation*}
describes that to move from the lower left to the upper left we multiply by
\begin{equation*}
  P^{-1}=\bigl(\rep{\identity}{\stdbasis_2,B}\bigr)^{-1}
    =\rep{\identity}{B,\stdbasis_2}
    =\begin{mat}[r]
        1  &-2  \\
        1  &2
     \end{mat}
\end{equation*}
and to move from the upper right to the lower right we multiply by
this matrix.
\begin{equation*}
  P=\begin{mat}[r]
      1  &-2  \\
      1  &2
     \end{mat}^{-1}\!\!
   =\begin{mat}[r]
      1/2  &1/2  \\
      -1/4 &1/4
   \end{mat}
\end{equation*}
So this equation expresses the similarity.
\begin{equation*}
  \begin{mat}[r]
     3  &0  \\
     1  &3
  \end{mat}
  =
  \begin{mat}[r]
      1/2  &1/2  \\
      -1/4 &1/4
   \end{mat}
   \begin{mat}[r]
      2  &-1  \\
      1  &4
    \end{mat}
    \begin{mat}[r]
        1  &-2  \\
        1  &2
     \end{mat}
\end{equation*}
\end{example}

\begin{example}
This matrix has characteristic polynomial \( (x-4)^4 \) 
\begin{equation*}
  T=
  \begin{mat}[r]
    4  &1  &0  &-1  \\
    0  &3  &0  &1   \\
    0  &0  &4  &0   \\
    1  &0  &0  &5
   \end{mat}
\end{equation*}
and so has the single eigenvalue $4$.
The
null space of $t-4$ has dimension two, the null space of $(t-4)^2$
has dimension three, and the null space of $(t-4)^3$ has dimension four.
Thus, $t-4$ has the action on a string basis of
$\vec{\beta}_1\mapsto\vec{\beta}_2\mapsto\vec{\beta}_3\mapsto\zero$ and
$\vec{\beta}_4\mapsto\zero$.
This gives the canonical form $N$ for $t-4$, which in turn gives the
form for \( t \).
\begin{equation*}
  N+4I=
  \begin{mat}[r]
    4  &0  &0  &0   \\
    1  &4  &0  &0   \\
    0  &1  &4  &0   \\
    0  &0  &0  &4
   \end{mat}
\end{equation*}
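As a check on those nullities, here is a sketch of the computation
(the matrices below are our own working from the $T$ above).
\begin{equation*}
  T-4I=
  \begin{mat}[r]
    0  &1  &0  &-1  \\
    0  &-1 &0  &1   \\
    0  &0  &0  &0   \\
    1  &0  &0  &1
   \end{mat}
  \qquad
  (T-4I)^2=
  \begin{mat}[r]
    -1 &-1 &0  &0   \\
    1  &1  &0  &0   \\
    0  &0  &0  &0   \\
    1  &1  &0  &0
   \end{mat}
\end{equation*}
The first matrix has rank two and the second has rank one,
so the nullities are two and three,
and multiplying once more gives that $(T-4I)^3$ is the zero matrix,
with nullity four.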
\end{example}

An array that is all zeroes, except for some number $\lambda$
down the diagonal and blocks of subdiagonal ones, is a 
\definend{Jordan block}.\index{Jordan block}
We have shown that Jordan block matrices are
canonical representatives of the similarity classes of single-eigenvalue
matrices.

\begin{example}
The \( \nbyn{3} \) matrices whose only eigenvalue is \( 1/2 \) separate into
three similarity classes.
The three classes have these canonical representatives.
\begin{equation*}
  \begin{mat}[r]
     1/2  &0    &0  \\
     0    &1/2  &0  \\
     0    &0    &1/2
   \end{mat}
   \qquad 
   \begin{mat}[r]
     1/2  &0    &0  \\
     1    &1/2  &0  \\
     0    &0    &1/2
   \end{mat}
   \qquad 
   \begin{mat}[r]
     1/2  &0    &0  \\
     1    &1/2  &0  \\
     0    &1    &1/2
   \end{mat}
\end{equation*}
In particular, this matrix
\begin{equation*}
   \begin{mat}[r]
     1/2  &0    &0    \\
     0    &1/2  &0    \\
     0    &1    &1/2
   \end{mat}
\end{equation*}
belongs to the similarity class represented by the middle one, because we have
adopted the convention of ordering the blocks of subdiagonal ones from the 
longest block to the shortest.
\end{example}

We will finish the program of this chapter by extending this work to 
cover maps and matrices with multiple eigenvalues.
The best possibility for general maps and matrices would be
if we could break them into a part involving 
their first eigenvalue \( \lambda_1 \) 
(which we represent using its Jordan block),
a part with \( \lambda_2 \), etc.

This best possibility is what happens.
For any transformation \( \map{t}{V}{V} \),
we shall break the space \( V \) into the direct sum of a part on which
\( t-\lambda_1 \) is nilpotent, a part on which \( t-\lambda_2 \)
is nilpotent, etc.
% More precisely, we shall take three steps to get to this section's major
% theorem and the third step shows that
% \( V=\gennullspace{t-\lambda_1}\directsum\cdots\directsum
%        \gennullspace{t-\lambda_z} \)
% where \( \lambda_1,\ldots,\lambda_z \) are \( t \)'s eigenvalues.

Suppose that \( \map{t}{V}{V} \) is a linear transformation.
The restriction %\appendrefs{restrictions of functions}\spacefactor=1000  %
of \( t \) to a subspace \( M \) need not be a linear transformation on \( M \)
because there may be an \( \vec{m}\in M \)
with \( t(\vec{m})\not\in M \) (for instance, the transformation 
that rotates the plane by
a quarter turn does not map most members of the $x=y$ line subspace back within
that subspace).
To ensure that the restriction of a transformation 
to a part of a space is a transformation on the part we need the next 
condition. 

\begin{definition} \label{def:invariant}
Let \( \map{t}{V}{V} \) be a transformation.
A subspace \( M \) is \definend{$t$ invariant}%
\index{invariant subspace}\index{subspace!invariant}
if whenever \( \vec{m}\in M \) then \( t(\vec{m})\in M \)
(shorter: \( t(M)\subseteq M \)).
\end{definition}
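
For instance, as an illustrative check of the definition, 
take the transformation $\map{t}{\C^2}{\C^2}$ represented 
with respect to the standard basis by the matrix $T$ of 
\nearbyexample{ex:SingJordBlock}.
The line $M=\set{\colvec{-y \\ y}\suchthat y\in\C}$ is $t$~invariant
but the $x$-axis is not, since the image of its member with 
components $1$ and~$0$ has a nonzero second component.
\begin{equation*}
  \begin{mat}[r]
      2  &-1  \\
      1  &4
    \end{mat}
  \colvec{-y \\ y}
  =\colvec{-3y \\ 3y}\in M
  \qquad
  \begin{mat}[r]
      2  &-1  \\
      1  &4
    \end{mat}
  \colvec[r]{1 \\ 0}
  =\colvec[r]{2 \\ 1}
\end{equation*}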
 
Recall that Lemma~III.\ref{le:RangeAndNullChains} shows that for any
transformation~$t$ on an $n$~dimensional space
the range spaces of iterates are stable 
\begin{equation*}
  \rangespace{t^n}=\rangespace{t^{n+1}}=\cdots=\genrangespace{t}
\end{equation*} 
as are the null spaces. 
\begin{equation*}
  \nullspace{t^n}=\nullspace{t^{n+1}}=\cdots=\gennullspace{t}
\end{equation*} 
Thus,
the generalized null space $\gennullspace{t}$ and the generalized range space
$\genrangespace{t}$ are $t$~invariant.
% For the generalized null space, if $\vec{v}\in\gennullspace{t}$ then
% $t^n(\vec{v})=\zero$ where $n$ is the dimension of the underlying space
% and so $t(\vec{v})\in\gennullspace{t}$ because
% $t^n(\,t(\vec{v})\,)$ is zero also.
% For the generalized rangespace, if $\vec{v}\in\genrangespace{t}$ then
% $\vec{v}=t^n(\vec{w})$ for some $\vec{w}$ and then 
% $t(\vec{v})=t^{n+1}(\vec{w})=t^n(\,t(\vec{w})\,)$ 
% shows that $t(\vec{v})$ is also a member of $\genrangespace{t}$.
In particular, 
$\gennullspace{t-\lambda_i}$ and $\genrangespace{t-\lambda_i}$
are $t-\lambda_i$~invariant.

The action of the transformation $t-\lambda_i$ on $\gennullspace{t-\lambda_i}$
is especially easy to understand.
Observe that any transformation~$t$ is nilpotent on $\gennullspace{t}$, 
because if $\vec{v}\in\gennullspace{t}$ then 
by definition $t^n(\vec{v})=\zero$. 
Thus $t-\lambda_i$ is nilpotent on 
$\gennullspace{t-\lambda_i}$.

% Thus, the generalized null space $\gennullspace{t-\lambda_i}$ is a part of
% the space on which the action of $t-\lambda_i$ is easy to understand.

We shall take three steps to prove this section's major result. 
The next result is the first.
% It establishes that if $j=neq i$ then \( t-\lambda_j \) leaves
% \( t-\lambda_i \)'s part unchanged.

\begin{lemma} \label{le:tInvIfftMinLambdaInv}
A subspace is \( t \) invariant if and only if 
it is \( t-\lambda \) invariant for all scalars \( \lambda \).
In particular, 
if \( \lambda_i \) is an eigenvalue of  a linear transformation
\( t \) then for any other eigenvalue $\lambda_j$
the spaces \( \gennullspace{t-\lambda_i} \) 
and \( \genrangespace{t-\lambda_i} \)
are \( t-\lambda_j \) invariant.
\end{lemma}

\begin{proof}
For the first sentence we check the two implications separately.
The `if' half is easy: if the subspace is $t-\lambda$ invariant for 
all scalars $\lambda$ then using $\lambda=0$ shows that it is $t$ invariant.
For `only if' suppose that the subspace is $t$ invariant,
so that if $\vec{m}\in M$ then $t(\vec{m})\in M$, and let $\lambda$ be 
a scalar.
The subspace $M$ is closed under linear combinations and so if 
$t(\vec{m})\in M$ then $t(\vec{m})-\lambda\vec{m}\in M$.
Thus if $\vec{m}\in M$ then $(t-\lambda)\,(\vec{m})\in M$.

The lemma's second sentence follows from its first.
The
two spaces are $t-\lambda_i$~invariant so they are \( t \)~invariant.
Apply the first sentence again to
conclude that they are also \( t-\lambda_j \) invariant.
\end{proof}

The second step of the three that we will take to prove this
section's major result makes use of an additional property of 
\( \gennullspace{t-\lambda_i} \) and
\( \genrangespace{t-\lambda_i} \), that they are complementary.
Recall that if a space is the direct sum of two others 
\( V=\mathscr{N}\directsum \mathscr{R} \) 
then any vector \( \vec{v} \) in the space breaks into
two parts \( \vec{v}=\vec{n}+\vec{r} \) where \( \vec{n}\in \mathscr{N} \) and
\( \vec{r}\in \mathscr{R} \), and recall also 
that if \( B_{\mathscr{N}} \) and \( B_{\mathscr{R}} \) are bases for
\( \mathscr{N} \) and \( \mathscr{R} \) then the concatenation
\( \cat{B_{\mathscr{N}}}{B_{\mathscr{R}}} \) is linearly independent.
The next result says that for any subspaces
\( \mathscr{N} \) and \( \mathscr{R} \) that are complementary 
as well as \( t \)~invariant,
the action
of \( t \) on \( \vec{v} \) breaks into the actions of
\( t \) on \( \vec{n} \) and on \( \vec{r} \).

\begin{lemma} \label{le:InvCompSubspSplitTrans}
Let \( \map{t}{V}{V} \) be a transformation and let \( \mathscr{N} \) and 
\( \mathscr{R} \) be
\( t \) invariant complementary subspaces of \( V \).
Then we can represent \( t \) by a matrix with
blocks of square submatrices $T_1$ and $T_2$
\begin{equation*} \renewcommand{\arraystretch}{1.2}
  \begin{pmat}{c|c}
      T_1   &Z_2  \\  \cline{1-2}
      Z_1 &T_2
   \end{pmat}
   \begin{array}{@{}l}
     \} \text{\ $\dim(\mathscr{N})$-many rows}  \\
     \} \text{\ $\dim(\mathscr{R})$-many rows}
   \end{array}
\end{equation*}
where \( Z_1 \) and \( Z_2 \) are blocks of zeroes.
\end{lemma}

\begin{proof}
Since the two subspaces are complementary, the concatenation of a basis
for \( \mathscr{N} \) with a basis for \( \mathscr{R} \) makes a basis
\( B=\sequence{\vec{\nu}_1,\dots,\vec{\nu}_p,
        \vec{\mu}_1,\ldots,\vec{\mu}_q}  \)
for \( V \).
We shall show that the matrix
\begin{equation*}
  \rep{t}{B,B}=
  \begin{pmat}{c@{\hspace*{1em}}c@{\hspace*{1em}}c}
     \vdots                   &        &\vdots     \\
     \rep{t(\vec{\nu}_1)}{B}  &\cdots  &\rep{t(\vec{\mu}_q)}{B}  \\
     \vdots                   &        &\vdots     \\
  \end{pmat}
\end{equation*}
has the desired form.

Any vector \( \vec{v}\in V \) is a member of \( \mathscr{N} \) 
if and only if when it is represented with respect to \( B \)
the final \( q \)
coefficients are zero.
As \( \mathscr{N} \) is \( t \)~invariant, each of the vectors
\( \rep{t(\vec{\nu}_1)}{B} \),
\ldots, \( \rep{t(\vec{\nu}_p)}{B} \) has this form.
Hence the lower left of \( \rep{t}{B,B} \) is all zeroes.
The argument for the upper right is similar.
\end{proof}

To see that we have decomposed \( t \) into its action on the parts, 
let $B_{\mathscr{N}}=\sequence{\vec{\nu}_1,\dots,\vec{\nu}_p}$ and 
$B_{\mathscr{R}}=\sequence{\vec{\mu}_1,\ldots,\vec{\mu}_q}$.
The restrictions of \( t \) to the subspaces \( \mathscr{N} \) 
and~\( \mathscr{R} \) 
are represented
with respect to the bases $B_{\mathscr{N}},B_{\mathscr{N}}$
and $B_{\mathscr{R}},B_{\mathscr{R}}$
by the matrices \( T_1 \) and \( T_2 \).
So with subspaces that are invariant and complementary 
we can split the problem of examining
a linear transformation into two lower-dimensional subproblems.
The next result illustrates this decomposition into blocks.

\begin{lemma} \label{le:DetIsProdOfSubDets}
If $T$ is a matrix with square submatrices $T_1$ and $T_2$
\begin{equation*} \renewcommand{\arraystretch}{1.2}
  T=
  \begin{pmat}{c|c}
      T_1   &Z_2  \\  \cline{1-2}
      Z_1   &T_2
   \end{pmat}
\end{equation*}
where the \( Z \)'s are blocks of zeroes,
then \( \deter{T}=\deter{T_1}\cdot\deter{T_2} \).
\end{lemma}

\begin{proof}
Suppose that \( T \) is \( \nbyn{n} \),
that \( T_1 \) is \( \nbyn{p} \),
and that \( T_2 \) is \( \nbyn{q} \).
In the permutation formula for the determinant
\begin{equation*}
  \deter{T}=
  \sum_{\text{\scriptsize permutations\ }\phi}
          t_{1,\phi(1)}t_{2,\phi(2)}\cdots t_{n,\phi(n)}\sgn(\phi)
\end{equation*}
each term comes from a rearrangement of the column numbers
\( 1,\dots,n \) into a new order \( \phi(1),\dots,\phi(n) \).
The upper right block $Z_2$ is all zeroes, so if a
\( \phi \) has at least one of \( p+1,\dots,n \) among its first
\( p \) column numbers \( \phi(1),\dots,\phi(p) \) then the term arising
from \( \phi \) does not contribute to the sum because it is zero,
e.g., if \( \phi(1)=n \) then
\( t_{1,\phi(1)}t_{2,\phi(2)}\dots t_{n,\phi(n)}
   =0\cdot t_{2,\phi(2)}\dots t_{n,\phi(n)}=0 \).

So the above formula reduces to a sum over all permutations with 
two halves:~any contributing $\phi$ is the composition of a $\phi_1$ that
rearranges only \( 1,\dots,p \) 
and a $\phi_2$ that rearranges only \( p+1,\dots,p+q \).
Now, the distributive law 
and the fact that the signum of a composition is the product
of the signums gives that this
\begin{multline*}
   \deter{T_1}\cdot\deter{T_2}=
   \bigg(\sum_{\begin{subarray}{c}
                \text{\scriptsize perms\ }\phi_1 \\
                \text{\scriptsize of\ } 1,\dots,p
               \end{subarray}}
       \!\!\! t_{1,\phi_1(1)}\cdots t_{p,\phi_1(p)}\sgn(\phi_1) \bigg)  \\
   \cdot
   \bigg(\sum_{\begin{subarray}{c}
                \text{\scriptsize perms\ }\phi_2 \\
                \text{\scriptsize of\ } p+1,\dots,p+q
               \end{subarray}}
       \!\!\! t_{p+1,\phi_2(p+1)}\cdots t_{p+q,\phi_2(p+q)}\sgn(\phi_2) 
        \bigg)
\end{multline*}
equals
  $\deter{T}=
  \sum_{\text{\scriptsize contributing\ }\phi}
          t_{1,\phi(1)}t_{2,\phi(2)}\cdots t_{n,\phi(n)}\sgn(\phi)$.
\end{proof}

\begin{example}
\begin{equation*}
    \begin{vmat}[r]
       2  &0  &0  &0  \\
       1  &2  &0  &0  \\
       0  &0  &3  &0  \\
       0  &0  &0  &3
    \end{vmat}
   =\begin{vmat}[r]
       2  &0  \\
       1  &2
    \end{vmat}
    \cdot
    \begin{vmat}[r]
       3  &0  \\
       0  &3
    \end{vmat}
   =36
\end{equation*}
\end{example}

From \nearbylemma{le:DetIsProdOfSubDets} we conclude that
if two subspaces 
are complementary and \( t \)~invariant then
\( t \) is nonsingular if and only if its 
restriction %\appendrefs{restrictions}
to each subspace is nonsingular,
because \( \deter{T}\neq 0 \) exactly when both \( \deter{T_1}\neq 0 \)
and \( \deter{T_2}\neq 0 \).

Now for the promised third, and final, step to the main result.

\begin{lemma}
If a linear transformation \( \map{t}{V}{V} \) has the 
characteristic polynomial
\( (x-\lambda_1)^{p_1}\dots(x-\lambda_k)^{p_k} \) then
(1)~\( V=\gennullspace{t-\lambda_1}\directsum\cdots
             \directsum\gennullspace{t-\lambda_k} \) 
and
(2)~\( \dim(\gennullspace{t-\lambda_i})=p_i  \).
\end{lemma}

\begin{proof}
This argument consists of proving
two preliminary claims, followed by proofs of clauses~(1) and~(2).

The first claim is that
\( \gennullspace{t-\lambda_i}\intersection\gennullspace{t-\lambda_j}=\set{\zero} \)
when \( i\neq j \).
By \nearbylemma{le:tInvIfftMinLambdaInv}
both \( \gennullspace{t-\lambda_i} \) and 
\( \gennullspace{t-\lambda_j} \) are \( t \)~invariant.
The intersection of \( t \) invariant subspaces is \( t \)
invariant and so the restriction of \( t \) to
\( \gennullspace{t-\lambda_i}\intersection\gennullspace{t-\lambda_j} \)
is a linear transformation.
Now,
$t-\lambda_i$ is nilpotent on \( \gennullspace{t-\lambda_i} \)
and  
$t-\lambda_j$ is nilpotent on \( \gennullspace{t-\lambda_j} \),
so both \( t-\lambda_i \) and \( t-\lambda_j \) are 
nilpotent on the intersection.
Therefore by \nearbylemma{le:NilIffOnlyEigenZero} and the observation
following it, if \( t \) has any eigenvalues on the intersection 
then the ``only'' eigenvalue is both
\( \lambda_i \) and \( \lambda_j \).
This cannot be, so the restriction has no eigenvalues:
\( \gennullspace{t-\lambda_i}\intersection\gennullspace{t-\lambda_j} \)
is the trivial space
(\nearbylemma{le:MapNonTrivSpHasEigen} shows that 
the only transformation that is without any eigenvalues is 
the transformation on the trivial space).

The second claim is that 
\( \gennullspace{t-\lambda_i} \subseteq \genrangespace{t-\lambda_j} \),
where \( i\neq j \).
To verify it we will show that \( t-\lambda_j \) is one-to-one on
\( \gennullspace{t-\lambda_i} \) so that, 
since \( \gennullspace{t-\lambda_i} \) is \( t-\lambda_j \) invariant
by \nearbylemma{le:tInvIfftMinLambdaInv},
the map \( t-\lambda_j \) is an automorphism of the subspace
\( \gennullspace{t-\lambda_i} \)
and therefore that 
\( \gennullspace{t-\lambda_i} \)  is a subset of each
\( \rangespace{t-\lambda_j} \), \( \rangespace{(t-\lambda_j)^2} \), etc.
For the verification that the map is one-to-one
suppose that \( \vec{v}\in\gennullspace{t-\lambda_i} \) is in the 
null space of \( t-\lambda_j \), aiming to show that \( \vec{v}=\zero \).
Consider the map 
\( [(t-\lambda_i)-(t-\lambda_j)]^n \).
On the one hand, the only
vector that \( (t-\lambda_i)-(t-\lambda_j)=\lambda_j-\lambda_i \) maps to 
zero is the zero vector, since \( \lambda_i\neq\lambda_j \).
On the other hand, as in the proof of 
\nearbylemma{le:PolyMapsFactor} we can 
apply the binomial expansion to get this.
\begin{equation*}
  (t-\lambda_i)^n(\vec{v})
    -\binom{n}{1}(t-\lambda_i)^{n-1}(t-\lambda_j)^1(\vec{v})
    +\binom{n}{2}(t-\lambda_i)^{n-2}(t-\lambda_j)^2(\vec{v})
    -\cdots 
\end{equation*}
The first term is zero because \( \vec{v}\in\gennullspace{t-\lambda_i} \)
while the remaining terms are zero because
\( \vec{v} \) is in the null space of \( t-\lambda_j \).
Therefore \( \vec{v}=\zero \). 

With those two preliminary claims done 
we can prove clause~(1), that the space is the direct sum of the
generalized null spaces.
By Corollary~III.\ref{GenRngNullDirSumToSp} the space is the direct sum
\( V=\gennullspace{t-\lambda_1}\directsum\genrangespace{t-\lambda_1} \).
By the second claim
\( \gennullspace{t-\lambda_2}\subseteq\genrangespace{t-\lambda_1} \)
and so we can get a basis for \( \genrangespace{t-\lambda_1} \) by
starting with a basis for \( \gennullspace{t-\lambda_2} \) and adding
extra basis elements taken from
\( \genrangespace{t-\lambda_1}\intersection\genrangespace{t-\lambda_2} \). 
Thus \( V=\gennullspace{t-\lambda_1}\directsum\gennullspace{t-\lambda_2}
          \directsum 
       (\genrangespace{t-\lambda_1}\intersection\genrangespace{t-\lambda_2}) \).
Continuing in this way we get this.
\begin{equation*}
  V=\gennullspace{t-\lambda_1}\directsum\cdots\directsum\gennullspace{t-\lambda_k}
          \directsum 
       (\genrangespace{t-\lambda_1}\intersection\cdots\intersection\genrangespace{t-\lambda_k})
\end{equation*}
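The final space is trivial.
Each \( \genrangespace{t-\lambda_i} \) is \( t \)~invariant and so 
their intersection is also \( t \)~invariant, 
and by Lemma~III.\ref{lem:RestONeToOne} each \( t-\lambda_i \) is 
one-to-one on \( \genrangespace{t-\lambda_i} \).
Thus the restriction of \( t \) to the intersection has none of the 
\( \lambda_i \) as an eigenvalue, so it has no eigenvalues at all, and 
\nearbylemma{le:MapNonTrivSpHasEigen} gives that the intersection is 
the trivial space.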

We finish by verifying clause~(2).
Decompose \( V \) as 
\( \gennullspace{t-\lambda_i}\directsum\genrangespace{t-\lambda_i} \)
%with basis $B=\cat{B_{\mathscr{N}}}{B_{\mathscr{R}}}$
and apply \nearbylemma{le:InvCompSubspSplitTrans}.
\begin{equation*}   \renewcommand{\arraystretch}{1.2}
  T=
%  \rep{t}{B,B}=
  \begin{pmat}{c|c}
      T_1   &Z_2  \\  \cline{1-2}
      Z_1   &T_2
   \end{pmat}
   \begin{array}{@{}l}
     \} \text{\ $\dim(\,\gennullspace{t-\lambda_i}\,)$-many rows}  \\
     \} \text{\ $\dim(\,\genrangespace{t-\lambda_i}\,)$-many rows}
   \end{array}
\end{equation*}
\nearbylemma{le:DetIsProdOfSubDets} says that
\( \deter{T-xI}=\deter{T_1-xI}\cdot\deter{T_2-xI} \).
By the uniqueness clause of the Fundamental Theorem of Algebra, 
Theorem~I.\ref{th:FundThmAlg},
the determinants of the blocks have the same factors as the
characteristic polynomial
\( \deter{T_1-xI}=(x-\lambda_1)^{q_1}\cdots(x-\lambda_k)^{q_k} \)
and
\( \deter{T_2-xI}=(x-\lambda_1)^{r_1}\cdots(x-\lambda_k)^{r_k} \),
where
\( q_1+r_1=p_1 \), \dots, \( q_k+r_k=p_k \).
We will finish by establishing that (i)~$q_j=0$ for all $j\neq i$,
and (ii)~$q_i=p_i$.
Together these prove clause~(2) because they show that the degree of 
the polynomial $\deter{T_1-xI}$ is $q_i$ and
the degree of that polynomial equals 
the dimension of the
generalized null space $\gennullspace{t-\lambda_i}$. 

For (i),
because the restriction of \( t-\lambda_i \) to \( \gennullspace{t-\lambda_i} \)
is nilpotent on that space,
$t$'s only eigenvalue on that space is \( \lambda_i \),
by Corollary~\ref{cor:tMinLambdaNilpotent}.
So $q_j=0$ for $j\neq i$.

For (ii),
consider the restriction of \( t \) to \( \genrangespace{t-\lambda_i} \).
By Lemma~III.\ref{lem:RestONeToOne}, the map
\( t-\lambda_i \) is one-to-one on
\( \genrangespace{t-\lambda_i} \) and so \( \lambda_i \) is not an
eigenvalue of \( t \) on that subspace.
Therefore \( x-\lambda_i \) is not a factor of \( \deter{T_2-xI} \),
so $r_i=0$, and so \( q_i=p_i \).
\end{proof}

Recall the goal of this chapter, to give a canonical form for matrix similarity.
That result is next.
It 
translates the above steps into matrix terms.

\begin{theorem}
\index{Jordan form!represents similarity classes}\index{similar!canonical form}
\index{canonical form!for similarity}\index{transformation!Jordan form for}
Any square matrix is similar to one in \definend{Jordan form}
\begin{equation*}
  \begin{mat}
    J_{\lambda_1}  &            &\text{\textit{--zeroes--}}                 \\
               &J_{\lambda_2}                                              \\
               &     &\ddots                                     \\
                &     &                           &J_{\lambda_{k-1}} &     \\
               &     &\text{\textit{--zeroes--}} &            &J_{\lambda_{k}}
  \end{mat}
\end{equation*}
where each \( J_{\lambda} \) is the Jordan block associated with an
eigenvalue $\lambda$ of the original matrix (that is, each \( J_{\lambda} \)
is all zeroes except for
\( \lambda \)'s down the diagonal and some subdiagonal ones).
\end{theorem}

\begin{proof}
Given an \( \nbyn{n} \) matrix \( T \), consider the linear map
\( \map{t}{\C^n}{\C^n} \) that it represents
with respect to the standard bases.
Use the prior lemma to write
\( \C^n=\gennullspace{t-\lambda_1}\directsum\cdots
        \directsum\gennullspace{t-\lambda_k} \)
where \( \lambda_1 \), \ldots, \(\lambda_k \) are the eigenvalues of \( t \).
Because each \( \gennullspace{t-\lambda_i} \)  is \( t \) invariant,
\nearbylemma{le:InvCompSubspSplitTrans} and the prior lemma show
that \( t \) is represented by a matrix that is all zeroes except for square
blocks along the diagonal.
To make those blocks into Jordan blocks, pick each \( B_{\lambda_i} \)
to be a string basis for the action of \( t-\lambda_i \) on
\( \gennullspace{t-\lambda_i} \). 
\end{proof}

\begin{corollary}
Every square matrix is similar to the sum of a diagonal matrix and a nilpotent
matrix.
\end{corollary}
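
As an illustration of that splitting, here is a sketch with one small 
matrix in Jordan form, chosen only as an example.
\begin{equation*}
  \begin{mat}[r]
     2  &0  &0  \\
     1  &2  &0  \\
     0  &0  &6
  \end{mat}
  =
  \begin{mat}[r]
     2  &0  &0  \\
     0  &2  &0  \\
     0  &0  &6
  \end{mat}
  +
  \begin{mat}[r]
     0  &0  &0  \\
     1  &0  &0  \\
     0  &0  &0
  \end{mat}
\end{equation*}
The first summand is diagonal and the second is nilpotent because its 
square is the zero matrix.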

For Jordan form to be a canonical form for 
matrix similarity,\index{representative!of similarity classes}
strictly speaking it must be unique.
That is, for any square matrix there needs to be one and only one matrix $J$
similar to it and of the specified form.
As stated the theorem allows us to rearrange the Jordan blocks.
We could make this form unique, say 
by arranging the Jordan blocks so the eigenvalues are in 
order, and then arranging
the blocks of subdiagonal ones from 
longest to shortest.
Below, we won't bother with that.

\begin{example} \label{ex:FirstJordForm}
This matrix 
has the characteristic polynomial \( (x-2)^2(x-6) \).
\begin{equation*}
   T=
   \begin{mat}[r]
     2  &0  &1  \\
     0  &6  &2  \\
     0  &0  &2
   \end{mat}
\end{equation*}
First we do the eigenvalue~$2$.
Computation of the powers of $T-2I$, and of the null spaces and nullities, 
is routine.
(Recall from \nearbyexample{ex:SingJordBlock} our convention
of taking $T$ to represent a transformation $\map{t}{\C^3}{\C^3}$
with respect to the standard basis.)
\begin{center}
  \renewcommand{\arraystretch}{1.25}
  \begin{tabular}{r|ccc} 
    \multicolumn{1}{c}{\( p \)}  
         &\( (T-2I)^p \) &\( \nullspace{(t-2)^p}  \) 
         &\textit{nullity}                                            \\  
    \hline
    \( 1 \)
    &\matrixvenlarge{\begin{mat}[r]
          0  &0  &1  \\
          0  &4  &2  \\
          0  &0  &0
        \end{mat}}
    &\( \set{\matrixvenlarge{\colvec{x \\ 0 \\ 0}}
         \suchthat x\in\C}  \)  
    &$1$                                                   \\
    \( 2 \)
    &\matrixvenlarge{\begin{mat}[r]
          0  &0  &0  \\
          0  &16 &8  \\
          0  &0  &0
        \end{mat}}
    &\( \set{\matrixvenlarge{\colvec{x \\ -z/2 \\  z}}
               \suchthat x,z\in\C}  \) 
    &$2$                                                   \\
    \( 3 \)
    &\matrixvenlarge{\begin{mat}[r]
          0  &0  &0  \\
          0  &64 &32 \\
          0  &0  &0
        \end{mat}}
    &\textit{--same--}
    &\textit{--same--}
  \end{tabular}
\end{center}
So the generalized null space $\gennullspace{t-2}$ has dimension two.
We know that the restriction of $t-2$ is nilpotent on this subspace.
From the way that the nullities grow we know that the action
of $t-2$ on a string basis is
$\vec{\beta}_1\mapsto\vec{\beta}_2\mapsto\zero$.  
Thus we can represent the restriction in the canonical form 
\begin{equation*}
  N_2=
  \begin{mat}[r]
    0  &0  \\
    1  &0  
  \end{mat}
  =\rep{t-2}{B_2,B_2}
  \qquad
   B_2=\sequence{\colvec[r]{1 \\ 1 \\ -2},
                 \colvec[r]{-2 \\ 0 \\ 0}}  
\end{equation*}
(other choices of basis are possible).
Consequently, the action of the restriction of $t$ to 
$\gennullspace{t-2}$ is represented by this matrix.
\begin{equation*}
  J_2=N_2+2I=\rep{t}{B_2,B_2}=
  \begin{mat}[r]
    2  &0  \\
    1  &2
  \end{mat}
\end{equation*}

The second eigenvalue is $6$.
Its computations are easier.
Because the power of $x-6$ in the characteristic polynomial is one,
the restriction of $t-6$ to $\gennullspace{t-6}$
must be nilpotent, of index one
(it can't be of index less than one and since $x-6$ is a 
factor of the characteristic polynomial with the exponent one it can't
be of index more than one either). 
Its action on a string basis must be $\vec{\beta}_3\mapsto\zero$ and
since it is the zero map, its canonical form $N_6$ 
is the $\nbyn{1}$ zero matrix.
Consequently, the canonical form $J_6$ for the action of $t$ on 
$\gennullspace{t-6}$ is the $\nbyn{1}$ matrix with the single entry $6$.
For the basis we can use any nonzero vector from the generalized null space.  
\begin{equation*}
   B_6=\sequence{\colvec[r]{0 \\ 1 \\ 0}}
\end{equation*}

Taken together, these two give that
the Jordan form of \( T \) is
\begin{equation*}
   \rep{t}{B,B}=
   \begin{mat}[r]
      2  &0  &0  \\
      1  &2  &0  \\
      0  &0  &6
   \end{mat}
\end{equation*}
where \( B \) is the concatenation of $B_2$ and $B_6$.
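One way to confirm this, sketched here as our own check,
follows \nearbyexample{ex:SingJordBlock} in taking
$P^{-1}=\rep{\identity}{B,\stdbasis_3}$ to be the matrix whose columns 
are the members of~$B$.
Writing $J$ for the Jordan form matrix above, 
it suffices to verify that $TP^{-1}=P^{-1}J$, which avoids computing~$P$.
\begin{align*}
  TP^{-1}
  &=\begin{mat}[r]
      2  &0  &1  \\
      0  &6  &2  \\
      0  &0  &2
    \end{mat}
    \begin{mat}[r]
      1  &-2  &0  \\
      1  &0   &1  \\
     -2  &0   &0
    \end{mat}
   =\begin{mat}[r]
      0  &-4  &0  \\
      2  &0   &6  \\
     -4  &0   &0
    \end{mat}                    \\
  P^{-1}J
  &=\begin{mat}[r]
      1  &-2  &0  \\
      1  &0   &1  \\
     -2  &0   &0
    \end{mat}
    \begin{mat}[r]
      2  &0  &0  \\
      1  &2  &0  \\
      0  &0  &6
    \end{mat}
   =\begin{mat}[r]
      0  &-4  &0  \\
      2  &0   &6  \\
     -4  &0   &0
    \end{mat}
\end{align*}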
\end{example}

\begin{example}  \label{SecJordanForm}
As a contrast with the prior example, this matrix
\begin{equation*}
   T=
   \begin{mat}[r]
     2  &2  &1  \\
     0  &6  &2  \\
     0  &0  &2
   \end{mat}
\end{equation*}
has the same characteristic polynomial \( (x-2)^2(x-6) \),
but here 
\begin{center}
  % \renewcommand{\arraystretch}{1.25}
  \begin{tabular}{r|ccc}
    \multicolumn{1}{c}{\( p \)}  &\( (T-2I)^p \)  &\( \nullspace{(t-2)^p}  \)  
       &\textit{nullity} \\                                     \hline
   \( 1 \)
   &\matrixvenlarge{\begin{mat}[r]
         0  &2  &1  \\
         0  &4  &2  \\
         0  &0  &0
       \end{mat}}
   &\( \set{\matrixvenlarge{\colvec{x \\ -z/2 \\ z}}\suchthat x,z\in\C}  \) 
   &$2$                                                     \\
   \( 2 \)
   &\matrixvenlarge{\begin{mat}[r]
         0  &8  &4  \\
         0  &16 &8  \\
         0  &0  &0
       \end{mat}}
   &\textit{--same--}
   &\textit{--same--}
 \end{tabular}
\end{center}
the action of $t-2$ is stable after only one application\Dash the 
restriction
of $t-2$ to $\gennullspace{t-2}$ is nilpotent of index one. 
The restriction of $t-2$ to the generalized null space acts on a string
basis via the two strings $\vec{\beta}_1\mapsto\zero$ 
and $\vec{\beta}_2\mapsto\zero$.
We have this Jordan block associated with the eigenvalue~$2$.
\begin{equation*}
  J_2=
  \begin{mat}[r]
    2  &0  \\
    0  &2  
  \end{mat}
\end{equation*}

So the contrast with the prior example is that while 
the characteristic polynomial tells us to look at the 
action of $t-2$ on its generalized null space, the characteristic
polynomial does not completely describe $t-2$'s action. 
We must do some computations to find that  
the minimal polynomial is \( (x-2)(x-6) \).
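Here is a sketch of one such computation, given as our own check.
\begin{equation*}
  (T-2I)(T-6I)=
  \begin{mat}[r]
     0  &2  &1  \\
     0  &4  &2  \\
     0  &0  &0
  \end{mat}
  \begin{mat}[r]
    -4  &2  &1  \\
     0  &0  &2  \\
     0  &0  &-4
  \end{mat}
  =
  \begin{mat}[r]
     0  &0  &0  \\
     0  &0  &0  \\
     0  &0  &0
  \end{mat}
\end{equation*}
So the polynomial $(x-2)(x-6)$ sends $T$ to the zero matrix.
No polynomial $x-c$ of degree one does that, because $T$ is not a 
scalar multiple of the identity, and therefore 
$(x-2)(x-6)$ is the minimal polynomial.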

For the eigenvalue $6$ the arguments for the second eigenvalue of
the prior example apply again.
The restriction of $t-6$ to $\gennullspace{t-6}$ is nilpotent of 
index one.
Thus $t-6$'s canonical form $N_6$ is the $\nbyn{1}$ zero matrix,
and the associated Jordan block $J_6$ is the $\nbyn{1}$ matrix with entry $6$.
 
Therefore the Jordan form for $T$ is a diagonal matrix.
\begin{equation*}
  \rep{t}{B,B}=
  \begin{mat}[r]
    2  &0  &0  \\
    0  &2  &0  \\
    0  &0  &6
  \end{mat}
  \qquad
  B=\cat{B_2}{B_6}
   =\sequence{\colvec[r]{1 \\ 0 \\ 0},
              \colvec[r]{0 \\ 1 \\ -2},
              \colvec[r]{2 \\ 4 \\ 0}}
\end{equation*}
(Checking that the third vector in $B$ is in the null space of $t-6$ is
routine.)
\end{example}

\begin{example} \label{ThirdJordanForm}
A bit of computing with
\begin{equation*}
   T=
   \begin{mat}[r]
     -1  &4  &0  &0  &0  \\
      0  &3  &0  &0  &0  \\
      0  &-4 &-1 &0  &0  \\
      3  &-9 &-4 &2  &-1 \\
      1  &5  &4  &1  &4
   \end{mat}
\end{equation*}
shows that its characteristic polynomial
is \( (x-3)^3(x+1)^2 \).
This table
\begin{center}
  % \renewcommand{\arraystretch}{1.25}
  \begin{tabular}{@{}r|c@{}c@{}c@{}}
    \multicolumn{1}{c}{\( p \)}  
            &\( (T-3I)^p \)  &\( \nullspace{(t-3)^p}  \)
             &\textit{nullity}   
    \\  \hline
    \( 1 \)
    &\matrixvenlarge{\begin{mat}[r]
         -4  &4  &0  &0  &0  \\
          0  &0  &0  &0  &0  \\
          0  &-4 &-4 &0  &0  \\
          3  &-9 &-4 &-1 &-1 \\
          1  &5  &4  &1  &1
       \end{mat}}
    &\( \set{\matrixvenlarge{\colvec{-(u+v)/2 \\
                     -(u+v)/2 \\
                      (u+v)/2 \\
                       u      \\
                       v}}\suchthat u,v\in\C}  \) 
    &$2$                                            \\
    \( 2 \)
    &\matrixvenlarge{\begin{mat}[r]
         16  &-16&0  &0  &0  \\
          0  &0  &0  &0  &0  \\
          0  &16 &16 &0  &0  \\
        -16  &32 &16 &0  &0  \\
          0  &-16&-16&0  &0
       \end{mat}}
    &\( \set{\matrixvenlarge{\colvec{ -z      \\
                      -z      \\
                       z      \\
                       u      \\
                       v}}\suchthat z,u,v\in\C}  \) 
    &$3$                                              \\
    \( 3 \)
    &\matrixvenlarge{\begin{mat}[r]
        -64  &64   &0   &0   &0  \\
          0  &0    &0   &0   &0  \\
          0  &-64  &-64 &0   &0  \\
         64  &-128 &-64 &0   &0  \\
          0  &64   &64  &0   &0
       \end{mat}}
    &\textit{--same--}
    &\textit{--same--}
  \end{tabular}
\end{center}
shows that the restriction of $t-3$ to $\gennullspace{t-3}$ acts on a 
string basis via the two strings
$\vec{\beta}_1\mapsto\vec{\beta}_2\mapsto\zero$
and
$\vec{\beta}_3\mapsto\zero$.

A similar calculation for the other eigenvalue
\begin{center}
  \renewcommand{\arraystretch}{1.25}
  \begin{tabular}{r|ccc}
    \multicolumn{1}{c}{\( p \)}  
         &\( (T+1I)^p \)  &\( \nullspace{(t+1)^p}  \) 
         &\textit{nullity}  \\  
    \hline
    \( 1 \)
    &\matrixvenlarge{\begin{mat}[r]
          0  &4  &0  &0  &0  \\
          0  &4  &0  &0  &0  \\
          0  &-4 &0  &0  &0  \\
          3  &-9 &-4 &3  &-1 \\
          1  &5  &4  &1  &5
       \end{mat}}
    &\( \set{\matrixvenlarge{\colvec{-(u+v)   \\
                       0      \\
                      -v      \\
                       u      \\
                       v}}\suchthat u,v\in\C}  \)  
    &$2$                                              \\
    \( 2 \)
    &\matrixvenlarge{\begin{mat}[r]
          0  &16 &0  &0  &0  \\
          0  &16 &0  &0  &0  \\
          0  &-16&0  &0  &0  \\
          8  &-40&-16&8  &-8 \\
          8  &24 &16 &8  &24
       \end{mat}}
    &\textit{--same--}
    &\textit{--same--}
  \end{tabular}
\end{center}
gives that the restriction of $t+1$ to its generalized null space
acts on a string basis via the two separate strings
$\vec{\beta}_4\mapsto\zero$ and $\vec{\beta}_5\mapsto\zero$.

Therefore
\( T \) is similar to this Jordan form matrix.
\begin{equation*}
   \begin{mat}[r]
     -1  &0  &0  &0  &0  \\
      0  &-1 &0  &0  &0  \\    
      0  &0  &3  &0  &0  \\
      0  &0  &1  &3  &0  \\
      0  &0  &0  &0  &3  
   \end{mat}
\end{equation*}
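As a quick plausibility check, offered as a sketch that assumes the 
standard fact that similar matrices have the same trace (the sum of the
entries on the diagonal), the trace of $T$ and the trace of the Jordan form
agree.
\begin{equation*}
   (-1)+3+(-1)+2+4=7
   \qquad
   (-1)+(-1)+3+3+3=7
\end{equation*}
Agreement does not prove similarity, but a mismatch would signal an 
arithmetic slip.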
\end{example}


\begin{exercises}
  \item 
    Do the check for \nearbyexample{ex:SingJordBlock}.
    \begin{answer}
      We must check that
      \begin{equation*}
         \begin{mat}[r]
           3  &0  \\
           1  &3
         \end{mat}
         =
         N+3I=PTP^{-1}
         =
         \begin{mat}[r]
           1/2  &1/2  \\
          -1/4 &1/4
         \end{mat}
         \begin{mat}[r]
           2  &-1  \\
           1  &4
         \end{mat}
         \begin{mat}[r]
           1  &-2  \\
           1  &2
         \end{mat}
      \end{equation*}
      That calculation is easy.
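      One way to organize the arithmetic, given here only as a sketch,
      is to multiply the first two matrices and then multiply the result 
      by the third.
      \begin{equation*}
         \begin{mat}[r]
           1/2  &1/2  \\
          -1/4 &1/4
         \end{mat}
         \begin{mat}[r]
           2  &-1  \\
           1  &4
         \end{mat}
         =
         \begin{mat}[r]
           3/2  &3/2  \\
          -1/4 &5/4
         \end{mat}
         \qquad
         \begin{mat}[r]
           3/2  &3/2  \\
          -1/4 &5/4
         \end{mat}
         \begin{mat}[r]
           1  &-2  \\
           1  &2
         \end{mat}
         =
         \begin{mat}[r]
           3  &0  \\
           1  &3
         \end{mat}
      \end{equation*}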
    \end{answer}
  \item 
    Each matrix is in Jordan form.
    State its characteristic polynomial and its minimal polynomial.
    \begin{exparts*}
      \partsitem 
        $\begin{mat}[r]
           3  &0  \\
           1  &3        
         \end{mat}$
      \partsitem 
        $\begin{mat}[r]
           -1  &0  \\
            0  &-1 
         \end{mat}$
      \partsitem 
        $\begin{mat}[r]
           2  &0  &0  \\
           1  &2  &0  \\
           0  &0  &-1/2
         \end{mat}$
      \partsitem 
        $\begin{mat}[r]
           3  &0  &0  \\
           1  &3  &0  \\
           0  &1  &3  \\ 
         \end{mat}$
      \partsitem 
        $\begin{mat}[r]
           3  &0  &0  &0  \\
           1  &3  &0  &0  \\
           0  &0  &3  &0  \\
           0  &0  &1  &3 
         \end{mat}$
      \partsitem 
        $\begin{mat}[r]
           4  &0  &0  &0  \\
           1  &4  &0  &0  \\
           0  &0  &-4 &0  \\
           0  &0  &1  &-4
         \end{mat}$
      \partsitem 
        $\begin{mat}[r]
           5  &0  &0  \\
           0  &2  &0  \\
           0  &0  &3  
         \end{mat}$
      \partsitem 
        $\begin{mat}[r]
           5  &0  &0  &0  \\
           0  &2  &0  &0  \\
           0  &0  &2  &0  \\
           0  &0  &0  &3 
         \end{mat}$
      \partsitem 
        $\begin{mat}[r]
           5  &0  &0  &0  \\
           0  &2  &0  &0  \\
           0  &1  &2  &0  \\
           0  &0  &0  &3 
         \end{mat}$
    \end{exparts*}
    \begin{answer}
      \begin{exparts}
        \partsitem The characteristic polynomial is $c(x)=(x-3)^2$ and
          the minimal polynomial is the same.
        \partsitem The characteristic polynomial is $c(x)=(x+1)^2$.
          The minimal polynomial is $m(x)=x+1$.
        \partsitem The characteristic polynomial is 
          $c(x)=(x+(1/2))(x-2)^2$ and
          the minimal polynomial is the same.
        \partsitem The characteristic polynomial is $c(x)=(x-3)^3$.
          The minimal polynomial is the same.
        \partsitem The characteristic polynomial is $c(x)=(x-3)^4$.
          The minimal polynomial is $m(x)=(x-3)^2$.
        \partsitem The characteristic polynomial is $c(x)=(x+4)^2(x-4)^2$ and
          the minimal polynomial is the same.
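        \partsitem The characteristic polynomial is 
          $c(x)=(x-2)(x-3)(x-5)$ and
          the minimal polynomial is the same.
        \partsitem The characteristic polynomial is 
          $c(x)=(x-2)^2(x-3)(x-5)$ and
          the minimal polynomial is $m(x)=(x-2)(x-3)(x-5)$.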
        \partsitem The characteristic polynomial is 
          $c(x)=(x-2)^2(x-3)(x-5)$ and
          the minimal polynomial is the same.
      \end{exparts}
    \end{answer}
  \recommended \item
    Find the Jordan form from the given data.
    \begin{exparts}
       \partsitem The matrix 
         \( T \) is \( \nbyn{5} \) with the single eigenvalue $3$.
         The nullities of the powers are:
         \( T-3I \) has nullity two, \( (T-3I)^2 \) has nullity three,
         \( (T-3I)^3 \) has nullity four, and \( (T-3I)^4 \) has nullity
         five.
       \partsitem The matrix \( S \) is \( \nbyn{5} \) with two eigenvalues.
         For the eigenvalue $2$ the nullities are:
         \( S-2I \) has nullity two, and \( (S-2I)^2 \) has nullity four.
         For the eigenvalue $-1$ the nullities are:
         \( S+1I \) has nullity one.
    \end{exparts}
    \begin{answer}
      \begin{exparts}
         \partsitem The transformation $t-3$ is nilpotent 
          (that is, $\gennullspace{t-3}$ is the entire space)
          and it acts on a string basis via two strings, 
          $\vec{\beta}_1\mapsto\vec{\beta}_2\mapsto\vec{\beta}_3
            \mapsto\vec{\beta}_4\mapsto\zero$
          and $\vec{\beta}_5\mapsto\zero$.
          Consequently, $t-3$ can be represented in this canonical form.
          \begin{equation*}
            N_3=
            \begin{mat}[r]
               0  &0  &0  &0  &0  \\
               1  &0  &0  &0  &0  \\
               0  &1  &0  &0  &0  \\
               0  &0  &1  &0  &0  \\
               0  &0  &0  &0  &0
             \end{mat}
          \end{equation*}
          and therefore $T$ is similar to this canonical form matrix.
          \begin{equation*}
           J_3=N_3+3I=
           \begin{mat}[r]
               3  &0  &0  &0  &0  \\
               1  &3  &0  &0  &0  \\
               0  &1  &3  &0  &0  \\
               0  &0  &1  &3  &0  \\
               0  &0  &0  &0  &3
             \end{mat}
         \end{equation*} 
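          One way to read the strings off of the nullities, offered 
          as a sketch, is to note that the increase in nullity from the
          power $p-1$ to the power $p$ counts the strings of length at
          least~$p$ (taking the nullity of the zero-th power to be zero).
          \begin{center}
            \renewcommand{\arraystretch}{1.25}
            \begin{tabular}{r|cc}
              \multicolumn{1}{c}{\textit{power}~$p$}
                  &\textit{nullity of} $(T-3I)^p$
                  &\textit{increase}  \\
              \hline
              $1$  &$2$  &$2$  \\
              $2$  &$3$  &$1$  \\
              $3$  &$4$  &$1$  \\
              $4$  &$5$  &$1$
            \end{tabular}
          \end{center}
          So two strings have length at least one, while only one string
          has length at least two, at least three, and at least four.
          That gives one string of length four and one of length one.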
         \partsitem The restriction of the transformation $s+1$ is nilpotent
          on the subspace $\gennullspace{s+1}$, and the action on a 
          string basis is  $\vec{\beta}_1\mapsto\zero$.  
          The restriction of the transformation $s-2$ is nilpotent
          on the subspace $\gennullspace{s-2}$, having the action on a 
          string basis of $\vec{\beta}_2\mapsto\vec{\beta}_3\mapsto\zero$
          and $\vec{\beta}_4\mapsto\vec{\beta}_5\mapsto\zero$.        
          Consequently the Jordan form is this.
          \begin{equation*}
            \begin{mat}[r]
              -1  &0  &0  &0  &0  \\
               0  &2  &0  &0  &0  \\
               0  &1  &2  &0  &0  \\
               0  &0  &0  &2  &0  \\
               0  &0  &0  &1  &2
             \end{mat} 
          \end{equation*}
      \end{exparts}  
    \end{answer}
  \item 
    Find the change of basis matrices for each example. 
    \begin{exparts*}
      \partsitem \nearbyexample{ex:FirstJordForm}
      \partsitem \nearbyexample{SecJordanForm}
      \partsitem \nearbyexample{ThirdJordanForm}
    \end{exparts*}
    \begin{answer}
      For each, because many choices of basis are possible, many other 
      answers are possible.
      Of course, the arbiter of whether an answer is correct is the 
      calculation checking that $PTP^{-1}$ is in Jordan form.
      \begin{exparts}
        \partsitem Here is the arrow diagram.
          \begin{equation*}
            \begin{CD}
              \C^3_{\wrt{\stdbasis_3}}      @>t>T>    \C^3_{\wrt{\stdbasis_3}}   \\
                @V\scriptstyle\identity V\scriptstyle PV  
                                    @V\scriptstyle\identity V\scriptstyle PV \\
              \C^3_{\wrt{B}}                 @>t>J>         \C^3_{\wrt{B}}
            \end{CD}
          \end{equation*}
          The matrix to move from the lower left to the upper left is this. 
          \begin{equation*}
            P^{-1}=\bigl(\rep{\identity}{\stdbasis_3,B}\bigr)^{-1}
                  =\rep{\identity}{B,\stdbasis_3}   
                  =\begin{mat}[r]
                     1  &-2   &0 \\
                     1  &0   &1 \\
                    -2  &0   &0
                   \end{mat}
          \end{equation*}
          The matrix $P$ to move from the upper right to the lower
          right is the inverse of $P^{-1}$.
        \partsitem We want this matrix and its inverse.
          \begin{equation*}
            P^{-1}=
            \begin{mat}[r]
              1  &0  &3  \\
              0  &1  &4  \\
              0  &-2 &0
            \end{mat}
          \end{equation*}
        \partsitem The concatenation of these bases for the 
          generalized null spaces will do for the basis for the
          entire space.
          \begin{equation*}
             B_{-1}=\sequence{\colvec[r]{-1\\ 0 \\  0 \\ 1 \\ 0},
                 \colvec[r]{-1\\ 0 \\ -1 \\ 0 \\ 1}}
            \qquad
             B_3=\sequence{\colvec[r]{1 \\ 1 \\ -1 \\ 0 \\ 0},
                 \colvec[r]{0 \\ 0 \\  0 \\-2 \\ 2},
                 \colvec[r]{-1\\-1 \\  1 \\ 2 \\ 0}}
          \end{equation*}
          The change of basis matrices are this one and its inverse.
          \begin{equation*}
            P^{-1}=
            \begin{mat}[r]
              -1  &-1  &1  &0  &-1  \\
              0   &0   &1  &0  &-1  \\
              0   &-1  &-1 &0  &1   \\
              1   &0   &0  &-2 &2   \\
              0   &1   &0  &2  &0   \\
            \end{mat}
          \end{equation*}
      \end{exparts}
    \end{answer}
  \recommended \item 
    Find the Jordan form and a Jordan basis for each matrix.
    \begin{exparts*}
      \partsitem 
        \(
        \begin{mat}[r]
          -10  &4  \\
          -25  &10
        \end{mat} \)
      \partsitem 
        \(
        \begin{mat}[r]
           5   &-4 \\
           9   &-7
        \end{mat} \)
      \partsitem 
        \(
        \begin{mat}[r]
           4   &0    &0  \\
           2   &1    &3  \\
           5   &0    &4
        \end{mat} \)
      \partsitem 
        \(
        \begin{mat}[r]
           5   &4    &3  \\
          -1   &0    &-3 \\
           1   &-2   &1
        \end{mat} \)
      \partsitem
        \(
        \begin{mat}[r]
           9   &7    &3  \\
          -9   &-7   &-4 \\
           4   &4    &4
        \end{mat} \)
      \partsitem 
        \(
        \begin{mat}[r]
           2   &2    &-1 \\
          -1   &-1   &1  \\
          -1   &-2   &2
        \end{mat} \)
      \partsitem 
        \(
        \begin{mat}[r]
           7   &1    &2   &2 \\
           1   &4    &-1  &-1\\
          -2   &1    &5   &-1\\
           1   &1    &2   &8
        \end{mat} \)
    \end{exparts*}
    \begin{answer}
      The general procedure is to factor the characteristic polynomial 
      $c(x)=(x-\lambda_1)^{p_1}(x-\lambda_2)^{p_2}\cdots $ 
      to get the eigenvalues $\lambda_1$, $\lambda_2$, etc. 
      Then, for each $\lambda_i$ we find a 
      string basis for the action of the transformation $t-\lambda_i$
      when restricted to $\gennullspace{t-\lambda_i}$.
      To do that we compute the powers of the matrix $T-\lambda_iI$ and find
      the associated null spaces, until these null spaces settle down
      (do not change), at which point we have the generalized null space.
      The dimensions of those null spaces (the nullities) tell us the
      action of $t-\lambda_i$ on a string basis for the generalized
      null space, and so we can write the pattern of subdiagonal ones
      to have $N_{\lambda_i}$.
      From this matrix, the Jordan block $J_{\lambda_i}$ associated
      with $\lambda_i$ is immediate: $J_{\lambda_i}=N_{\lambda_i}+\lambda_iI$.
      Finally, after we have done this for each eigenvalue, we put them
      together into the canonical form.
      \begin{exparts}
        \partsitem The characteristic polynomial of this matrix
           is $c(x)=(-10-x)(10-x)+100=x^2$, 
           so it has only the single eigenvalue $\lambda=0$.
           \begin{center}
             \renewcommand{\arraystretch}{1.25}
             \begin{tabular}{r|ccc}
                \multicolumn{1}{c}{\textit{power}~$p$} 
                    &$(T-0\cdot I)^p$ &$\nullspace{(t-0)^p}$
                    &\textit{nullity}                   \\ 
                \hline
                $1$  
                &\matrixvenlarge{\begin{mat}[r]
                  -10  &4  \\ 
                  -25  &10
                \end{mat}}
                &$\set{\matrixvenlarge{\colvec{2y/5 \\ y}}
                                     \suchthat
                                     y\in\C}$
                &$1$                                       \\
                $2$  
                &\matrixvenlarge{\begin{mat}[r]
                    0  &0  \\ 
                    0  &0
                \end{mat}}
                &$\C^2$
                &$2$
             \end{tabular}
           \end{center}
           (Thus, this transformation is nilpotent: 
           $\gennullspace{t-0}$ is the entire space).
           From the nullities we know that $t$'s
           action on a string basis is 
           $\vec{\beta}_1\mapsto\vec{\beta}_2\mapsto\zero$.
           This is the canonical form matrix for the action of $t-0$ on
           $\gennullspace{t-0}=\C^2$
           \begin{equation*}
             N_0=
             \begin{mat}[r]
               0  &0  \\
               1  &0
            \end{mat}
           \end{equation*}
           and this is the Jordan form of the matrix.
           \begin{equation*}
             J_0=N_0+0\cdot I=
             \begin{mat}[r]
               0  &0  \\
               1  &0
            \end{mat}
           \end{equation*}
           Note that if a matrix is nilpotent then its canonical form
           equals its Jordan form.

           We can find such a string basis using the techniques of the 
           prior section.
           \begin{equation*}
                 B=\sequence{\colvec[r]{1 \\ 0},
                             \colvec[r]{-10 \\ -25}}
           \end{equation*}
           We took the first basis vector so that it is in
           the null space of $t^2$ but is not in the null space of $t$.
           The second basis vector is the image of the first under $t$.
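           As a check on this basis (a sketch of the arithmetic), applying
           the matrix to each basis vector in turn reproduces the string
           $\vec{\beta}_1\mapsto\vec{\beta}_2\mapsto\zero$.
           \begin{equation*}
             \begin{mat}[r]
               -10  &4  \\ 
               -25  &10
             \end{mat}
             \colvec[r]{1 \\ 0}
             =\colvec[r]{-10 \\ -25}
             \qquad
             \begin{mat}[r]
               -10  &4  \\ 
               -25  &10
             \end{mat}
             \colvec[r]{-10 \\ -25}
             =\colvec[r]{0 \\ 0}
           \end{equation*}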
        \partsitem The characteristic polynomial of this matrix
           is \( c(x)=(x+1)^2 \), so it is a single-eigenvalue matrix.
           (That is, the generalized null space of $t+1$ is the entire
           space.) 
           We have
           \begin{equation*}
             \nullspace{t+1}=\set{\colvec{2y/3 \\ y}\suchthat
                                     y\in\C} 
             \qquad
             \nullspace{(t+1)^2}=\C^2 
           \end{equation*}
           and so the action of $t+1$ on
           an associated string basis is 
           $\vec{\beta}_1\mapsto\vec{\beta}_2\mapsto\zero$.
           Thus the canonical form matrix for the action of $t+1$ is
           \begin{equation*}
             N_{-1}
             =
             \begin{mat}[r]
                0  &0  \\
                1  &0 
             \end{mat}
           \end{equation*}
           and so the Jordan form of $T$ is
           \begin{equation*}
             J_{-1}=N_{-1}+(-1)\cdot I
             =
             \begin{mat}[r]
               -1  &0  \\
                1  &-1
             \end{mat}
           \end{equation*}
           and choosing vectors from the above null spaces gives
           this string basis (other choices are possible).
           \begin{equation*}
             B=\sequence{\colvec[r]{1 \\ 0},
                         \colvec[r]{6 \\ 9}}
           \end{equation*}
        \partsitem The characteristic polynomial 
            \( c(x)=(1-x)(4-x)^2=-1\cdot (x-1)(x-4)^2 \) has two roots
            and they are the eigenvalues $\lambda_1=1$ and $\lambda_2=4$.

            We handle the two eigenvalues separately.
            For $\lambda_1$, the calculation of the powers of $T-1I$
            yields
            \begin{equation*}
              \nullspace{t-1}=\set{\colvec{0 \\ y \\ 0}
                                      \suchthat y\in\C}
            \end{equation*}
            and the null space of $(t-1)^2$ is the same.
            Thus this set is the generalized null space 
            $\gennullspace{t-1}$.
            The nullities show that the action of the restriction of $t-1$ 
            to the generalized null space on a string basis
            is  $\vec{\beta}_1\mapsto\zero$.

            A similar calculation for $\lambda_2=4$ gives these null spaces.
            \begin{equation*}
              \nullspace{t-4}=\set{\colvec{0 \\ z \\ z}
                                      \suchthat z\in\C}
              \qquad
              \nullspace{(t-4)^2}=\set{\colvec{z-y \\ y \\ z}
                                          \suchthat y,z\in\C}
            \end{equation*}
            (The null space of $(t-4)^3$ is the same, as it must be because
            the power of the term associated with $\lambda_2=4$ in the
            characteristic polynomial is two, and so the restriction of
            $t-4$ to the generalized null space $\gennullspace{t-4}$
            is nilpotent of index at most two\Dash it takes at most
            two applications of $t-4$ for the null space to settle down.)
            The pattern of how the nullities rise tells us that
             the action of $t-4$ on an associated string basis 
            for $\gennullspace{t-4}$ is 
            $\vec{\beta}_2\mapsto\vec{\beta}_3\mapsto\zero$.

            Putting the information for the two eigenvalues 
            together gives the Jordan form of the transformation $t$.
            \begin{equation*}
              \begin{mat}[r]
                1  &0  &0  \\
                0  &4  &0  \\
                0  &1  &4
              \end{mat}
            \end{equation*}
            We can take elements of the null spaces to get an appropriate
            basis.
            \begin{equation*}
              B=\cat{B_{1}}{B_4}=
               \sequence{\colvec[r]{0 \\ 1 \\ 0},
                          \colvec[r]{1 \\ 0 \\ 1},
                          \colvec[r]{0 \\ 5 \\ 5}}
            \end{equation*}
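            A check of the string 
            $\vec{\beta}_2\mapsto\vec{\beta}_3\mapsto\zero$, 
            sketched here with the matrix $T-4I$ written out, is this.
            \begin{equation*}
              \begin{mat}[r]
                0  &0   &0  \\
                2  &-3  &3  \\
                5  &0   &0
              \end{mat}
              \colvec[r]{1 \\ 0 \\ 1}
              =\colvec[r]{0 \\ 5 \\ 5}
              \qquad
              \begin{mat}[r]
                0  &0   &0  \\
                2  &-3  &3  \\
                5  &0   &0
              \end{mat}
              \colvec[r]{0 \\ 5 \\ 5}
              =\colvec[r]{0 \\ 0 \\ 0}
            \end{equation*}
            Similarly, the first basis vector is sent to itself by $T$, 
            so $T-1I$ sends it to $\zero$.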
        \partsitem The characteristic polynomial is 
            \( c(x)=(-2-x)(4-x)^2=-1\cdot (x+2)(x-4)^2 \).

            For the eigenvalue $\lambda_1=-2$, calculation of the
            powers of $T+2I$ yields this.
            \begin{equation*}
              \nullspace{t+2}=\set{\colvec{-z \\ z \\ z}
                                      \suchthat z\in\C}
            \end{equation*}
            The null space of $(t+2)^2$ is the same, and so 
            this is the generalized null space $\gennullspace{t+2}$.
            Thus the action of the restriction of $t+2$ to 
            $\gennullspace{t+2}$ on an associated
            string basis is $\vec{\beta}_1\mapsto\zero$.

            For $\lambda_2=4$, 
            computing the powers of $T-4I$ yields 
            \begin{equation*}
              \nullspace{t-4}=\set{\colvec{z \\ -z \\ z}
                                      \suchthat z\in\C} 
              \qquad
              \nullspace{(t-4)^2}=\set{\colvec{x \\ -z \\ z}
                                           \suchthat x,z\in\C}
            \end{equation*}
            and so the action of $t-4$ on a string basis for 
            $\gennullspace{t-4}$ is
            $\vec{\beta}_2\mapsto\vec{\beta}_3\mapsto\zero$.

            Therefore the Jordan form is  
            \begin{equation*}
              \begin{mat}[r]
                -2  &0  &0  \\
                 0  &4  &0  \\
                 0  &1  &4
              \end{mat}
            \end{equation*}
            and a suitable basis is this.
            \begin{equation*}
              B=\cat{B_{-2}}{B_4}=
                \sequence{\colvec[r]{-1 \\ 1 \\ 1},
                          \colvec[r]{0 \\ -1 \\ 1},
                          \colvec[r]{-1 \\ 1 \\ -1}}
            \end{equation*}
        \partsitem The characteristic polynomial of this
            matrix is \( c(x)=(2-x)^3=-1\cdot (x-2)^3 \).
            This matrix has only a single eigenvalue, $\lambda=2$.
            By finding the powers of $T-2I$ we have  
            \begin{equation*}
              \nullspace{t-2}=\set{\colvec{-y \\ y \\ 0}
                                      \suchthat y\in\C} 
              \qquad
              \nullspace{(t-2)^2}=\set{\colvec{-y-(1/2)z \\ y \\ z}
                                          \suchthat y,z\in\C}
            \end{equation*}
            and
            \begin{equation*} 
              \nullspace{(t-2)^3}=\C^3
            \end{equation*}
            and so 
            the action of $t-2$ on an
            associated string basis is
            $\vec{\beta}_1\mapsto\vec{\beta}_2\mapsto
                   \vec{\beta}_3\mapsto\zero$.
            The Jordan form is this
            \begin{equation*}
                  \begin{mat}[r]
                    2  &0  &0  \\
                    1  &2  &0  \\
                    0  &1  &2
                  \end{mat}
            \end{equation*}
            and one choice of basis is this.
            \begin{equation*}
              B=\sequence{\colvec[r]{0 \\ 1 \\ 0},
                          \colvec[r]{7 \\ -9 \\ 4},
                          \colvec[r]{-2 \\ 2 \\ 0}}
            \end{equation*}
        \partsitem The characteristic polynomial
            \( c(x)=(1-x)^3=-(x-1)^3 \) has only a single root,
            so the matrix has only a single eigenvalue $\lambda=1$.
            Finding the powers of $T-1I$ 
            and calculating the null spaces
            \begin{equation*}
               \nullspace{t-1}=\set{\colvec{-2y+z \\ y \\ z}
                                      \suchthat y,z\in\C} 
               \qquad
               \nullspace{(t-1)^2}=\C^3 
            \end{equation*}
            shows that the action of the nilpotent map $t-1$ on a string
            basis is
            $\vec{\beta}_1\mapsto\vec{\beta}_2\mapsto\zero$ and
            $\vec{\beta}_3\mapsto\zero$.
            Therefore the Jordan form is
            \begin{equation*}
                  J=
                  \begin{mat}[r]
                    1  &0  &0  \\
                    1  &1  &0  \\
                    0  &0  &1
                  \end{mat}
            \end{equation*}
            and an appropriate basis (a string basis associated with
            $t-1$) is this.
            \begin{equation*}
              B=\sequence{\colvec[r]{0 \\ 1 \\ 0},
                          \colvec[r]{2 \\ -2 \\ -2},
                          \colvec[r]{1 \\ 0 \\ 1}}
            \end{equation*}
        \partsitem The characteristic polynomial is a bit large for by-hand
            calculation, but just manageable:
            \( c(x)=x^4-24x^3+216x^2-864x+1296=(x-6)^4 \).
            This is a single-eigenvalue map, so
            the transformation $t-6$ is nilpotent.
            The null spaces
            \begin{equation*}
               \nullspace{t-6}=\set{\colvec{-z-w \\ -z-w \\ z \\ w}
                                      \suchthat z,w\in\C} 
               \qquad
               \nullspace{(t-6)^2}=\set{\colvec{x \\ -z-w \\ z \\ w}
                                      \suchthat x,z,w\in\C} 
            \end{equation*}
            and
            \begin{equation*}
               \nullspace{(t-6)^3}=\C^4 
            \end{equation*}
            and the nullities
            show that the action of $t-6$ on a string basis is
            $\vec{\beta}_1\mapsto\vec{\beta}_2\mapsto
                   \vec{\beta}_3\mapsto\zero$ and
            $\vec{\beta}_4\mapsto\zero$.
            The Jordan form is
            \begin{equation*}
              \begin{mat}[r]
                6  &0  &0  &0  \\
                1  &6  &0  &0  \\
                0  &1  &6  &0  \\
                0  &0  &0  &6  \\
              \end{mat}
            \end{equation*}
            and finding a suitable string basis is routine.
            \begin{equation*}
              B=\sequence{\colvec[r]{0 \\ 0 \\ 0 \\ 1},
                          \colvec[r]{2 \\ -1 \\ -1 \\ 2},
                          \colvec[r]{3 \\ 3 \\ -6 \\ 3},
                          \colvec[r]{-1 \\ -1 \\ 1 \\ 0}}
            \end{equation*}
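            For a check of this basis, given as a sketch of the arithmetic,
            start with the matrix
            \begin{equation*}
              T-6I=
              \begin{mat}[r]
                1   &1    &2   &2 \\
                1   &-2   &-1  &-1\\
               -2   &1    &-1  &-1\\
                1   &1    &2   &2
              \end{mat}
            \end{equation*}
            and apply it to each basis vector in turn.
            \begin{equation*}
              \colvec[r]{0 \\ 0 \\ 0 \\ 1}
              \mapsto\colvec[r]{2 \\ -1 \\ -1 \\ 2}
              \mapsto\colvec[r]{3 \\ 3 \\ -6 \\ 3}
              \mapsto\colvec[r]{0 \\ 0 \\ 0 \\ 0}
              \qquad
              \colvec[r]{-1 \\ -1 \\ 1 \\ 0}
              \mapsto\colvec[r]{0 \\ 0 \\ 0 \\ 0}
            \end{equation*}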
      \end{exparts}  
    \end{answer}
  \recommended \item
    Find all possible Jordan forms of a transformation with characteristic
    polynomial \( (x-1)^2(x+2)^2  \).
    \begin{answer}
      There are two eigenvalues, $\lambda_1=-2$ and $\lambda_2=1$.
      The restriction of $t+2$ to 
      $\gennullspace{t+2}$ could have either of these actions 
      on an associated string basis.
      \begin{equation*}
        \vec{\beta}_1\mapsto\vec{\beta}_2\mapsto\zero
        \qquad
        \begin{array}[t]{l}
          \vec{\beta}_1\mapsto\zero  \\
          \vec{\beta}_2\mapsto\zero 
        \end{array}
      \end{equation*}
      The restriction of $t-1$ to 
      $\gennullspace{t-1}$ could have either of these actions 
      on an associated string basis.
      \begin{equation*}
        \vec{\beta}_3\mapsto\vec{\beta}_4\mapsto\zero
        \qquad
        \begin{array}[t]{l}
          \vec{\beta}_3\mapsto\zero  \\
          \vec{\beta}_4\mapsto\zero 
        \end{array}
      \end{equation*}
      In combination, that makes four possible Jordan forms:
      the two first actions, the second and first, the first and second, and
      the two second actions.
      \begin{equation*}
        \begin{mat}[r]
          -2  &0  &0  &0  \\
           1  &-2 &0  &0  \\
           0  &0  &1  &0  \\
           0  &0  &1  &1
        \end{mat}
        \quad
        \begin{mat}[r]
          -2  &0  &0  &0  \\
           0  &-2 &0  &0  \\
           0  &0  &1  &0  \\
           0  &0  &1  &1
        \end{mat}
        \quad
        \begin{mat}[r]
          -2  &0  &0  &0  \\
           1  &-2 &0  &0  \\
           0  &0  &1  &0  \\
           0  &0  &0  &1
        \end{mat}
        \quad
        \begin{mat}[r]
          -2  &0  &0  &0  \\
           0  &-2 &0  &0  \\
           0  &0  &1  &0  \\
           0  &0  &0  &1
        \end{mat}
     \end{equation*}  
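      As a cross-check, and assuming the fact that the power of $x-\lambda$
      in the minimal polynomial equals the length of the longest string
      associated with $\lambda$, these four forms have, respectively, 
      these minimal polynomials.
      \begin{equation*}
        (x+2)^2(x-1)^2
        \qquad
        (x+2)(x-1)^2
        \qquad
        (x+2)^2(x-1)
        \qquad
        (x+2)(x-1)
      \end{equation*}
      Since similar matrices have the same minimal polynomial, no two of the 
      four forms are similar.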
    \end{answer}
  \item 
    Find all possible Jordan forms of a transformation with characteristic
    polynomial \( (x-1)^3(x+2) \).
    \begin{answer}
     The restriction of $t+2$ to 
     $\gennullspace{t+2}$ can have only the action
     $\vec{\beta}_1\mapsto\zero$.
     The restriction of $t-1$ to $\gennullspace{t-1}$ could have any
     of these three actions on an associated string basis. 
     \begin{equation*}
        \vec{\beta}_2\mapsto\vec{\beta}_3\mapsto\vec{\beta}_4\mapsto\zero
        \qquad
        \begin{array}[t]{l}
          \vec{\beta}_2\mapsto\vec{\beta}_3\mapsto\zero  \\
          \vec{\beta}_4\mapsto\zero 
        \end{array}
        \qquad
        \begin{array}[t]{l}
          \vec{\beta}_2\mapsto\zero  \\
          \vec{\beta}_3\mapsto\zero  \\
          \vec{\beta}_4\mapsto\zero 
        \end{array}
     \end{equation*}
     Taken together there are three possible Jordan forms,
     the one arising from the first action by $t-1$ (along with the only
     action from $t+2$), the one arising from the second action, and
     the one arising from the third action.
     \begin{equation*}
       \begin{mat}[r]
         -2  &0  &0  &0  \\
          0  &1  &0  &0  \\
          0  &1  &1  &0  \\
          0  &0  &1  &1
       \end{mat}
       \quad
       \begin{mat}[r]
         -2  &0  &0  &0  \\
          0  &1  &0  &0  \\
          0  &1  &1  &0  \\
          0  &0  &0  &1
       \end{mat}
       \quad
       \begin{mat}[r]
         -2  &0  &0  &0  \\
          0  &1  &0  &0  \\
          0  &0  &1  &0  \\
          0  &0  &0  &1
       \end{mat}
     \end{equation*}
    \end{answer}
  \recommended \item
    Find all possible Jordan forms of a transformation with characteristic
    polynomial \( (x-2)^3(x+1) \) and minimal polynomial \( (x-2)^2(x+1) \).
    \begin{answer}
      The action of $t+1$ on a string basis for $\gennullspace{t+1}$
      must be $\vec{\beta}_1\mapsto\zero$. 
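      Because of the power of \( x-2 \) in the minimal polynomial, the
      longest string in a string basis for $t-2$ has length two.
      The power of \( x-2 \) in the characteristic polynomial is three,
      so the string lengths add to three and
      the action of \( t-2 \) on \( \gennullspace{t-2} \)
      must be of this form.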
      \begin{equation*}
        \begin{array}[t]{l}
          \vec{\beta}_2\mapsto\vec{\beta}_3\mapsto\zero  \\
          \vec{\beta}_4\mapsto\zero 
        \end{array}        
      \end{equation*}
      Therefore there is only one Jordan form that is possible.
      \begin{equation*}
          \begin{mat}[r]
            -1  &0  &0  &0  \\
             0  &2  &0  &0  \\
             0  &1  &2  &0  \\
             0  &0  &0  &2
          \end{mat}
       \end{equation*}
     \end{answer}
  \item 
    Find all possible Jordan forms of a transformation with characteristic
    polynomial \( (x-2)^4(x+1) \) and minimal polynomial \( (x-2)^2(x+1) \).
    \begin{answer}
      There are two possible Jordan forms.
      The action of $t+1$ on a string basis for $\gennullspace{t+1}$
      must be $\vec{\beta}_1\mapsto\zero$.
      There are two actions for $t-2$ on a string basis for
      $\gennullspace{t-2}$ that are possible with this characteristic 
      polynomial and minimal polynomial.
      \begin{equation*}
        \begin{array}[t]{l}
          \vec{\beta}_2\mapsto\vec{\beta}_3\mapsto\zero  \\
          \vec{\beta}_4\mapsto\vec{\beta}_5\mapsto\zero  
        \end{array}        
        \qquad
        \begin{array}[t]{l}
          \vec{\beta}_2\mapsto\vec{\beta}_3\mapsto\zero  \\
          \vec{\beta}_4\mapsto\zero                      \\
          \vec{\beta}_5\mapsto\zero                      
        \end{array}        
      \end{equation*}
      The resulting Jordan form matrices are these. 
      \begin{equation*}
        \begin{mat}[r]
          -1  &0  &0  &0  &0  \\
           0  &2  &0  &0  &0  \\
           0  &1  &2  &0  &0  \\
           0  &0  &0  &2  &0  \\
           0  &0  &0  &1  &2
        \end{mat}
        \qquad
        \begin{mat}[r]
          -1  &0  &0  &0  &0  \\
           0  &2  &0  &0  &0  \\
           0  &1  &2  &0  &0  \\
           0  &0  &0  &2  &0  \\
           0  &0  &0  &0  &2
        \end{mat}
     \end{equation*}  
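      One way to see that no other cases arise, sketched here: the string
      lengths for $t-2$ must add to four, the power of $x-2$ in the
      characteristic polynomial, and the longest string must have length two,
      the power of $x-2$ in the minimal polynomial.
      \begin{equation*}
        4=2+2
        \qquad
        4=2+1+1
      \end{equation*}
      The other ways to write four as a sum of positive integers,
      $4$, $3+1$, and $1+1+1+1$, do not have two as their largest term.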
    \end{answer}
  \recommended \item Diagonalize these.
    \begin{exparts*}
       \partsitem \( \begin{mat}[r]
                  1  &1  \\
                  0  &0
                \end{mat}  \)
       \partsitem \( \begin{mat}[r]
                  0  &1  \\
                  1  &0
                \end{mat}  \)
    \end{exparts*}
    \begin{answer}
      \begin{exparts}
        \partsitem The characteristic polynomial is \( c(x)=x(x-1) \).
          For $\lambda_1=0$ we have
          \begin{equation*}
            \nullspace{t-0}=\set{\colvec{-y \\ y}
                                 \suchthat y\in\C }  
          \end{equation*} 
          (of course, the null space of $t^2$ is the same).
          For $\lambda_2=1$,
          \begin{equation*}
            \nullspace{t-1}=\set{\colvec{x \\ 0}
                                     \suchthat x\in\C }  
          \end{equation*}
          (and the null space of $(t-1)^2$ is the same).
          We can take this basis
          \begin{equation*}
            B=\sequence{\colvec[r]{1 \\ -1},\colvec[r]{1 \\ 0}}
          \end{equation*}
          to get the diagonalization.
          \begin{equation*}
            \begin{mat}[r]
              1  &1  \\
             -1  &0
            \end{mat}^{-1}
            \begin{mat}[r]
              1  &1  \\
              0  &0
            \end{mat}
            \begin{mat}[r]
              1  &1  \\
             -1  &0
            \end{mat}
            =
            \begin{mat}[r]
              0  &0  \\
              0  &1
            \end{mat}
          \end{equation*}
        \partsitem The characteristic polynomial is 
          \( c(x)=x^2-1=(x+1)(x-1) \).
          For $\lambda_1=-1$,
          \begin{equation*}
            \nullspace{t+1}=\set{\colvec{-y \\ y}
                                    \suchthat y\in\C } 
          \end{equation*}
          and the null space of $(t+1)^2$ is the same.
          For $\lambda_2=1$ 
          \begin{equation*}
            \nullspace{t-1}=\set{\colvec{y \\ y}
                                    \suchthat y\in\C } 
          \end{equation*}
          and the null space of $(t-1)^2$ is the same.
          We can take this basis
          \begin{equation*}
            B=\sequence{\colvec[r]{1 \\ -1},\colvec[r]{1 \\ 1}}
          \end{equation*}
          to get a diagonalization.
          \begin{equation*}
            \begin{mat}[r]
              1   &1  \\
              -1  &1
            \end{mat}^{-1}
            \begin{mat}[r]
              0  &1  \\
              1  &0
            \end{mat}
            \begin{mat}[r]
              1   &1  \\
              -1  &1
            \end{mat}
            =
            \begin{mat}[r]
              -1  &0  \\
              0   &1
            \end{mat}
          \end{equation*}
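          One way to check this without computing an inverse is to verify
          that $TP=PD$, where $T$ is the given matrix, $P$ is the matrix
          whose columns are the vectors of~$B$, and $D$ is the diagonal
          matrix (the names $P$ and~$D$ are introduced here only for this
          check).
          \begin{equation*}
            TP=
            \begin{mat}[r]
              0  &1  \\
              1  &0
            \end{mat}
            \begin{mat}[r]
              1   &1  \\
              -1  &1
            \end{mat}
            =
            \begin{mat}[r]
              -1  &1  \\
              1   &1
            \end{mat}
            \qquad
            PD=
            \begin{mat}[r]
              1   &1  \\
              -1  &1
            \end{mat}
            \begin{mat}[r]
              -1  &0  \\
              0   &1
            \end{mat}
            =
            \begin{mat}[r]
              -1  &1  \\
              1   &1
            \end{mat}
          \end{equation*}
          The two products agree, and $P$ is nonsingular, so
          $P^{-1}TP=D$.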
      \end{exparts}  
     \end{answer}
  \recommended \item 
    Find the Jordan matrix representing the differentiation
    operator on \( \polyspace_3 \).
    \begin{answer}
      The transformation $\map{d/dx}{\polyspace_3}{\polyspace_3}$ 
      is nilpotent.
      Its action on the basis \( B=\sequence{x^3,3x^2,6x,6} \)
      is $x^3\mapsto 3x^2\mapsto 6x\mapsto 6\mapsto 0$.
      Its Jordan form is its canonical form as a nilpotent matrix.
      \begin{equation*}
         J=
         \begin{mat}[r]
           0  &0  &0  &0  \\
           1  &0  &0  &0  \\
           0  &1  &0  &0  \\
           0  &0  &1  &0
         \end{mat}
      \end{equation*}
    \end{answer}
   \recommended \item 
      Decide if these two are similar.
      \begin{equation*}
         \begin{mat}[r]
            1  &-1 \\
            4  &-3 \\
         \end{mat}
         \qquad
         \begin{mat}[r]
           -1  &0  \\
            1  &-1 \\
         \end{mat}
      \end{equation*}
      \begin{answer}
        Yes.
        Each has the characteristic polynomial $(x+1)^2$.
        Calculation of the powers of $T_1+1\cdot I$ and 
        $T_2+1\cdot I$ gives these two.
        \begin{equation*}
          \nullspace{t_1+1}=\set{\colvec{y/2 \\ y} \suchthat y\in\C}
          \qquad
          \nullspace{t_2+1}=\set{\colvec{0 \\ y} \suchthat y\in\C}
        \end{equation*}
        (Of course, for each the null space of the square is 
        the entire space.)
        The way that the nullities rise shows that each is  
        similar to this Jordan form matrix
        \begin{equation*}
           \begin{mat}[r]
             -1  &0  \\
              1  &-1 \\
           \end{mat}
        \end{equation*}
        and they are therefore similar to each other.  
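        As a sketch of the calculation behind the parenthetical remark,
        squaring each of $T_1+1\cdot I$ and $T_2+1\cdot I$ gives the zero
        matrix, so in each case the null space of the square is all 
        of~$\C^2$.
        \begin{equation*}
           \begin{mat}[r]
             2  &-1  \\
             4  &-2
           \end{mat}^2
           =
           \begin{mat}[r]
             0  &0  \\
             0  &0
           \end{mat}
           \qquad
           \begin{mat}[r]
             0  &0  \\
             1  &0
           \end{mat}^2
           =
           \begin{mat}[r]
             0  &0  \\
             0  &0
           \end{mat}
        \end{equation*}
        Thus for each matrix the nullities rise from one to two.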
      \end{answer}
  \item 
     Find the Jordan form of this matrix.
     \begin{equation*}
        \begin{mat}[r]
           0  &-1  \\
           1  &0
        \end{mat}
     \end{equation*}
     Also give a Jordan basis.
     \begin{answer}
       Its characteristic polynomial is
       \( c(x)=x^2+1 \) which has complex roots
       \( x^2+1=(x+i)(x-i) \).
       Because the roots are distinct,
       the matrix is diagonalizable and its Jordan form is that
       diagonal matrix. 
       \begin{equation*}
         \begin{mat}[r]
           -i  &0  \\
            0  &i
         \end{mat}
       \end{equation*}  
       To find an associated basis we compute the null spaces.
       \begin{equation*}
         \nullspace{t+i}=\set{\colvec{-iy \\ y}
                                        \suchthat y\in\C} 
         \qquad
         \nullspace{t-i}=\set{\colvec{iy \\ y}
                                        \suchthat y\in\C} 
       \end{equation*}
       For instance, 
       \begin{equation*}
         T+i\cdot I=
         \begin{mat}[r]
           i  &-1  \\
           1  &i
         \end{mat}
       \end{equation*}
       and so we get a description of the null space of $t+i$ by solving
       this linear system.
       \begin{equation*}
         \begin{linsys}{2}
           ix  &-  &y  &=  &0  \\
            x  &+  &iy &=  &0
         \end{linsys}
         \grstep{i\rho_1+\rho_2}
         \begin{linsys}{2}
           ix  &-  &y  &=  &0  \\
               &   &0  &=  &0
         \end{linsys}
       \end{equation*}
       (To change the relation $ix=y$ so that the leading variable $x$ is
       expressed in terms of the free variable $y$, we can multiply both
       sides by $-i$.)

       As a result, one such basis is this.
       \begin{equation*}
         B=\sequence{\colvec[r]{-i \\ 1},
                     \colvec[r]{i \\ 1}}
       \end{equation*}  
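       One way to check the basis is to apply the matrix to each vector
       and confirm that the result is the vector multiplied by the 
       corresponding eigenvalue.
       \begin{equation*}
         \begin{mat}[r]
           0  &-1  \\
           1  &0
         \end{mat}
         \colvec[r]{-i \\ 1}
         =\colvec[r]{-1 \\ -i}
         =-i\cdot\colvec[r]{-i \\ 1}
         \qquad
         \begin{mat}[r]
           0  &-1  \\
           1  &0
         \end{mat}
         \colvec[r]{i \\ 1}
         =\colvec[r]{-1 \\ i}
         =i\cdot\colvec[r]{i \\ 1}
       \end{equation*}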
     \end{answer}
  \item 
    How many similarity classes are there for \( \nbyn{3} \) matrices
    whose only eigenvalues are \( -3 \) and \( 4 \)?
    \begin{answer}
     We can count the possible classes by counting the possible
     canonical representatives, that is, the possible Jordan form matrices.
     The characteristic polynomial must be either $c_1(x)=(x+3)^2(x-4)$
     or $c_2(x)=(x+3)(x-4)^2$.
     In the $c_1$ case there are two possible actions 
     of $t+3$ on a string basis for $\gennullspace{t+3}$.
     \begin{equation*}
       \vec{\beta}_1\mapsto\vec{\beta}_2\mapsto \zero
       \qquad
       \begin{array}[t]{l}
         \vec{\beta}_1\mapsto\zero \\
         \vec{\beta}_2\mapsto\zero 
       \end{array}
     \end{equation*}
     There are two associated Jordan form matrices.
     \begin{equation*}
        \begin{mat}[r]
          -3  &0  &0  \\
           1  &-3 &0  \\
           0  &0  &4
        \end{mat}
        \qquad
        \begin{mat}[r]
          -3  &0  &0  \\
           0  &-3 &0  \\
           0  &0  &4
        \end{mat}
      \end{equation*}
      Similarly there are two Jordan form matrices that could arise
      out of $c_2$.
      \begin{equation*}
        \begin{mat}[r]
          -3  &0  &0  \\
           0  &4  &0  \\
           0  &1  &4
        \end{mat}
        \qquad
        \begin{mat}[r]
          -3  &0  &0  \\
           0  &4  &0  \\
           0  &0  &4
        \end{mat}
     \end{equation*}  
     So in total there are four possible Jordan forms.
    \end{answer}
  \recommended \item
    Prove that a matrix is diagonalizable if and only if its minimal
    polynomial has only linear factors, each to the first power.
    \begin{answer}
       Jordan form is unique.
       A diagonal matrix is in Jordan form.
       Thus the Jordan form of a diagonalizable matrix is its diagonalization.
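       If the minimal polynomial has a factor to some power higher than one
       then the Jordan form has subdiagonal \( 1 \)'s, and so is not
       diagonal.  
       Conversely, if every factor of the minimal polynomial appears to the
       first power then every Jordan block is \( \nbyn{1} \), and so the 
       Jordan form is diagonal.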
     \end{answer}
  \item 
    Give an example of a linear transformation on a vector
    space that has no non-trivial invariant subspaces.
    \begin{answer}
      One example is the transformation of \( \C \) that
       sends \( x \) to \( -x \).  
     \end{answer}
  \item 
    Show that a subspace is \( t-\lambda_1 \) invariant if and only if
    it is \( t-\lambda_2 \) invariant.
    \begin{answer}
      Apply \nearbylemma{le:tInvIfftMinLambdaInv} twice;
      the subspace is $t-\lambda_1$~invariant if and only if it is 
      $t$~invariant, which in turn holds if and only if it is 
      $t-\lambda_2$~invariant.  
    \end{answer}
  \item 
     Prove or disprove: two \( \nbyn{n} \) matrices are
     similar if and only if they have the same characteristic and
     minimal polynomials.
     \begin{answer}
       False; these two $\nbyn{4}$ matrices each have $c(x)=(x-3)^4$
       and $m(x)=(x-3)^2$.
       \begin{equation*}
          \begin{mat}[r]
             3  &0  &0  &0  \\
             1  &3  &0  &0  \\
             0  &0  &3  &0  \\
             0  &0  &1  &3
          \end{mat}
          \quad
          \begin{mat}[r]
             3  &0  &0  &0  \\
             1  &3  &0  &0  \\
             0  &0  &3  &0  \\
             0  &0  &0  &3
          \end{mat}
       \end{equation*} 
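       To see that the two are not similar even though their polynomials
       agree, one can compare nullities.
       As a sketch, write $T_1$ and $T_2$ for the two matrices in the order 
       shown (these names are introduced here).
       \begin{equation*}
          T_1-3I=
          \begin{mat}[r]
             0  &0  &0  &0  \\
             1  &0  &0  &0  \\
             0  &0  &0  &0  \\
             0  &0  &1  &0
          \end{mat}
          \quad
          T_2-3I=
          \begin{mat}[r]
             0  &0  &0  &0  \\
             1  &0  &0  &0  \\
             0  &0  &0  &0  \\
             0  &0  &0  &0
          \end{mat}
       \end{equation*} 
       Each of these squares to the zero matrix, which gives
       $m(x)=(x-3)^2$ for both.
       But the first has rank two and so nullity two, while the second has
       rank one and so nullity three.
       If $T_1$ and $T_2$ were similar then $T_1-3I$ and $T_2-3I$
       would also be similar, and so would have the same nullity.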
     \end{answer}
  \item 
    The \definend{trace}\index{trace}\index{matrix!trace} 
    of a square matrix is the sum of its diagonal entries.
    \begin{exparts}
       \partsitem Find the formula for the characteristic polynomial of
         a $\nbyn{2}$ matrix.
       \partsitem Show that 
         trace is invariant under similarity, and so we can sensibly
         speak of the `trace of a map'.\index{linear map!trace}
         (\textit{Hint:}  see the prior item.)
       \partsitem Is trace invariant under matrix equivalence?
       \partsitem Show that the trace of a map is the sum of its eigenvalues
         (counting multiplicities).
       \partsitem Show that the trace of a nilpotent map is zero.
         Does the converse hold?
    \end{exparts}
    \begin{answer}
      \begin{exparts}
         \partsitem The characteristic polynomial is this. 
           \begin{equation*}
             \begin{vmat}
               a-x  &b  \\
               c  &d-x
             \end{vmat}
             =(a-x)(d-x)-bc=ad-(a+d)x+x^2-bc
             =x^2-(a+d)x+(ad-bc)
           \end{equation*}
           Note that the determinant appears as the constant term.
         \partsitem Recall that the characteristic polynomial
            \( \deter{T-xI} \) is invariant under similarity.
             Use the permutation expansion formula to show that the
             coefficient of \( x^{n-1} \) in that polynomial is
             \( (-1)^{n-1} \) times the trace.
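             A sketch of that expansion, writing $t_{i,j}$ for the entries
             of~$T$ (notation assumed here):
             a permutation other than the identity moves at least two
             indices, so its term contains at most $n-2$ diagonal entries
             and has degree at most $n-2$ in~$x$.
             Hence the $x^{n-1}$ term comes only from the product down 
             the diagonal.
             \begin{equation*}
               \deter{T-xI}
               =(t_{1,1}-x)\cdots(t_{n,n}-x)+\cdots
               =(-1)^nx^n+(-1)^{n-1}(t_{1,1}+\cdots+t_{n,n})x^{n-1}+\cdots
             \end{equation*}
             For $n=2$ this agrees with the first item, where the
             coefficient of~$x$ is $-(a+d)$.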
         \partsitem No, there are matrices $T$ and $S$ that are
            equivalent $S=PTQ$ (for some nonsingular $P$ and $Q$)
            but that have different traces.
            An easy example is this.
            \begin{equation*}
               PTQ=
               \begin{mat}[r]
                  2  &0  \\
                  0  &1
               \end{mat}
               \begin{mat}[r]
                  1  &0  \\
                  0  &1
               \end{mat}
               \begin{mat}[r]
                  1  &0  \\
                  0  &1
               \end{mat}
               =
               \begin{mat}[r]
                  2  &0  \\
                  0  &1
               \end{mat}
            \end{equation*}
            Even easier examples using $\nbyn{1}$ matrices are possible.
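         \partsitem Put the matrix in Jordan form; the diagonal entries
            are the eigenvalues, repeated according to multiplicity.
            By the second item, the trace is unchanged.
         \partsitem The first part is easy; use the fourth item, since the
            only eigenvalue of a nilpotent map is zero.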
            The converse does not hold:~this matrix
            \begin{equation*}
               \begin{mat}[r]
                  1  &0  \\
                  0  &-1
               \end{mat}
            \end{equation*}
            has a trace of zero but is not nilpotent.
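            One way to confirm that it is not nilpotent is to square it.
            \begin{equation*}
               \begin{mat}[r]
                  1  &0  \\
                  0  &-1
               \end{mat}^2
               =
               \begin{mat}[r]
                  1  &0  \\
                  0  &1
               \end{mat}
            \end{equation*}
            So every even power is the identity and every odd power is
            the matrix itself; no power is the zero matrix.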
       \end{exparts}  
     \end{answer}
  \item 
    To use \nearbydefinition{def:invariant} to check whether a subspace
    is $t$~invariant, we seemingly have to check all of the infinitely many
    vectors in a (nontrivial) subspace to see if they satisfy the condition.
    Prove that a subspace is \( t \)~invariant if and only if it has a basis
    with the property that for all of its elements, $t(\vec{\beta})$ is in 
    the subspace.
    \begin{answer}
      Suppose that \( B_M \) is a basis for a subspace \( M \) of some vector
      space.
      Implication one way is clear; if \( M \) is \( t \) invariant then
      in particular, if \( \vec{m}\in B_M \) then \( t(\vec{m})\in M \).
      For the other implication, let
      \( B_M=\sequence{\vec{\beta}_1,\dots,\vec{\beta}_q} \) and note that
      for any \( \vec{m}=m_1\vec{\beta}_1+\dots+m_q\vec{\beta}_q \) 
      in \( M \),
      \( t(\vec{m})=t(m_1\vec{\beta}_1+\dots+m_q\vec{\beta}_q)
                   =m_1t(\vec{\beta}_1)+\dots+m_qt(\vec{\beta}_q) \)
      is in \( M \), as each \( t(\vec{\beta}_i) \) is in \( M \) and 
      any subspace is closed under linear combinations. 
    \end{answer}
  \recommended \item 
    Is \( t \) invariance preserved under intersection?
    Under union?
    Complementation?
    Sums of subspaces?
    \begin{answer}
      Yes, the intersection 
      of $t$ invariant subspaces is $t$~invariant.
      Assume that \( M \) and \( N \) are \( t \) invariant.
      If \( \vec{v}\in M\intersection N \) then \( t(\vec{v})\in M \)
      by the invariance of \( M \) and \( t(\vec{v})\in N \) by the
      invariance of \( N \).

      Of course, the union of two subspaces need not be a subspace
      (remember that the $x$-\hbox{} and $y$-axes are subspaces of the plane
      $\Re^2$ but the union of the two axes fails to be closed
      under vector addition; for instance it does not contain
      $\vec{e}_1+\vec{e}_2$).
      However, the union of invariant subsets is an invariant subset; if
      \( \vec{v}\in M\union N \) then \( \vec{v}\in M \) or \( \vec{v}\in N \)
      so \( t(\vec{v})\in M \) or \( t(\vec{v})\in N \).

      No, the complement of an invariant subspace need not be invariant.
      Consider the subspace
      \begin{equation*}
        \set{\colvec{x \\ 0}\suchthat x\in\C}
      \end{equation*}
      of \( \C^2 \) under the zero transformation.

      Yes, the sum of two invariant subspaces is invariant.
      The check is easy.  
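      A sketch of that check:
      any vector in \( M+N \) has the form \( \vec{m}+\vec{n} \) with
      \( \vec{m}\in M \) and \( \vec{n}\in N \), and linearity gives this.
      \begin{equation*}
        t(\vec{m}+\vec{n})=t(\vec{m})+t(\vec{n})
      \end{equation*}
      The first summand is in \( M \) by the invariance of \( M \) and
      the second is in \( N \) by the invariance of \( N \), so the
      result is in \( M+N \).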
    \end{answer}
  \item 
     Give a way to order the Jordan blocks if some of the eigenvalues
     are complex numbers.
     That is, suggest a reasonable ordering for the complex numbers.
     \begin{answer}
       One such ordering is the \definend{dictionary ordering}.
       Order by the real component first, then by the coefficient of \( i \).
       For instance, \( 3+2i<4+1i \) because \( 3<4 \), and 
       \( 4+1i<4+2i \) because the real parts tie and \( 1<2 \).  
     \end{answer}
  \item
    Let \( \polyspace_j(\Re) \) be the vector space over
    the reals of polynomials of degree at most \( j \).
    Show that if \( j\le k \) then \( \polyspace_j(\Re) \) is an invariant
    subspace of \( \polyspace_k(\Re) \) under the differentiation operator.
    In \( \polyspace_7(\Re) \), does any of \( \polyspace_0(\Re) \),
    \ldots, \( \polyspace_6(\Re) \) have an invariant complement?
    \begin{answer}
     The first half is easy\Dash the derivative of any nonconstant real
      polynomial is a real polynomial of lower degree, and the derivative
      of a constant is zero.
      The answer to the second half is `no'; any complement of
      \( \polyspace_j(\Re) \) must include a polynomial of degree \( j+1 \),
      and the derivative of that polynomial is in \( \polyspace_j(\Re)\).  
     \end{answer}
  \item
    In \( \polyspace_n(\Re) \), the vector space (over the
    reals) of polynomials of degree at most \( n \),
    \begin{equation*}
      \mathcal{E}=
      \set{p(x)\in\polyspace_n(\Re)\suchthat p(-x)=p(x) \text{\ for all\ }x}
    \end{equation*}
    and
    \begin{equation*}
      \mathcal{O}=
      \set{p(x)\in\polyspace_n(\Re)\suchthat p(-x)=-p(x) \text{\ for all\ }x}
    \end{equation*}
    are the \definend{even}\index{even polynomials}\index{polynomial!even} 
    and the \definend{odd}\index{odd polynomials}\index{polynomial!odd}
     polynomials; \( p(x)=x^2 \) is
    even while \( p(x)=x^3 \) is odd.
    Show that they are subspaces.
    Are they complementary?
    Are they invariant under the differentiation transformation?
    \begin{answer}
      For the first two questions, show that each is a subspace and then 
      observe that any polynomial can be uniquely
       written as the sum of even-powered and odd-powered terms (the
       zero polynomial is both), so the two are complementary.
       The answer to the question about invariance is `no': \( x^2 \) is 
       even while its derivative \( 2x \) is odd.  
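       One standard way to make the decomposition explicit, sketched here
       with the names $p_e$ and $p_o$ introduced only for this purpose,
       is to take these two.
       \begin{equation*}
         p_e(x)=\frac{p(x)+p(-x)}{2}
         \qquad
         p_o(x)=\frac{p(x)-p(-x)}{2}
       \end{equation*}
       Then $p_e$ is even, $p_o$ is odd, and $p=p_e+p_o$.
       If a polynomial $p$ is both even and odd then $p(x)=-p(x)$ for 
       all~$x$, so $p$ is the zero polynomial; thus the intersection
       of $\mathcal{E}$ and $\mathcal{O}$ is trivial.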
     \end{answer}
  \item 
    \nearbylemma{le:InvCompSubspSplitTrans} says that if \( M \) and
    \( N \) are
    invariant complements then \( t \) has a representation in the given
    block form (with respect to the same ending as starting basis, of course).
    Does the implication reverse?
    \begin{answer}
      Yes.
      If \( \rep{t}{B,B} \) has the given block form, take \( B_M \) to
      be the first \( j \) vectors of \( B \), where \( J \) is the
      \( \nbyn{j} \) upper left submatrix.
      Take \( B_N \) to be the remaining \( k \) vectors in \( B \).
      Let \( M \) and \( N \) be the spans of \( B_M \) and \( B_N \).
      Clearly \( M \) and \( N \) are complementary.
      To see \( M \) is invariant (\( N \) works the same way), represent
      any \( \vec{m}\in M \) with respect to \( B \), note the last
      \( k \) components are zeroes, and multiply by the given block
      matrix.
      The final \( k \) components of the result are zeroes, so that
      result is again in \( M \). 
     \end{answer}
   \item 
     A matrix \( S \) is the \definend{square root}\index{square root}
     of another \( T \) if \( S^2=T \).
     Show that any nonsingular matrix has a square root.
     \begin{answer}
         Put the matrix in Jordan form.
         By non-singularity, there are no zero eigenvalues on the diagonal.
         Ape this example:
          \begin{equation*}
             \begin{mat}[r]
                9  &0  &0 \\
                1  &9  &0 \\
                0  &0  &4
             \end{mat}
             =
             \begin{mat}[r]
                3  &0  &0 \\
               1/6 &3  &0 \\
                0  &0  &2
             \end{mat}^2
          \end{equation*}
         to construct a square root.
         Show that it holds up under similarity: if \( S^2=T \) then
         \( (PSP^{-1})(PSP^{-1})=PTP^{-1} \). 
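         For a single $\nbyn{2}$ Jordan block with eigenvalue 
         $\lambda\neq 0$, the pattern in the example can be sketched as
         follows, where $\mu$ is any complex number with $\mu^2=\lambda$
         (the symbol $\mu$ is introduced here).
          \begin{equation*}
             \begin{mat}
               \mu       &0  \\
               1/(2\mu)  &\mu
             \end{mat}^2
             =
             \begin{mat}
               \mu^2  &0  \\
               1      &\mu^2
             \end{mat}
             =
             \begin{mat}
               \lambda  &0  \\
               1        &\lambda
             \end{mat}
          \end{equation*}
         With $\lambda=9$ and $\mu=3$ this gives the entry $1/6$ above.
         This sketch covers only blocks of size one and two; a larger block
         needs additional nonzero entries below the subdiagonal.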
    \end{answer}
\index{Jordan form|)}
\end{exercises}