|
In linear algebra, a square matrix A is called diagonalizable if it is similar to a diagonal matrix, i.e. if there exists an invertible matrix P such that <math>P^{-1}AP</math> is a diagonal matrix. If V is a finite-dimensional vector space, then a linear map T : V → V is called diagonalizable if there exists a basis of V with respect to which T is represented by a diagonal matrix. Diagonalization is the process of finding a corresponding diagonal matrix for a diagonalizable matrix or linear map.
Diagonalizable matrices and maps are of interest because diagonal matrices are especially easy to handle: their eigenvalues and eigenvectors are known and one can raise a diagonal matrix to a power by simply raising the diagonal entries to that same power.
The fundamental fact about diagonalizable maps and matrices is expressed by the following:
- An n-by-n matrix A over the field F is diagonalizable if and only if the sum of the dimensions of its eigenspaces is equal to n, which is the case if and only if there exists a basis of <math>F^n</math> consisting of eigenvectors of A. If such a basis has been found, one can form the matrix P having these basis vectors as columns, and <math>P^{-1}AP</math> will be a diagonal matrix. The diagonal entries of this matrix are the eigenvalues of A.
- A linear map T : V → V is diagonalizable if and only if the sum of the dimensions of its eigenspaces is equal to dim(V), which is the case if and only if there exists a basis of V consisting of eigenvectors of T. With respect to such a basis, T will be represented by a diagonal matrix. The diagonal entries of this matrix are the eigenvalues of T.
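The eigenspace criterion can be checked mechanically once the eigenvalues are known. The following is a minimal sketch, not a definitive implementation: it assumes the caller supplies every distinct eigenvalue of A in the field, works over the rationals with exact arithmetic, and uses the hypothetical helper names `rank`, `eigenspace_dim` and `is_diagonalizable`.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix via Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0  # index of the next pivot row
    for col in range(n_cols):
        # Look for a row at or below r with a nonzero entry in this column.
        pivot = next((i for i in range(r, n_rows) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r


def eigenspace_dim(A, lam):
    """dim ker(lam*I - A), computed as n - rank(lam*I - A)."""
    n = len(A)
    shifted = [[(lam if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]
    return n - rank(shifted)


def is_diagonalizable(A, eigenvalues):
    """True if the eigenspace dimensions sum to n.

    Assumption: `eigenvalues` contains every distinct eigenvalue of A.
    """
    return sum(eigenspace_dim(A, lam) for lam in set(eigenvalues)) == len(A)


scalar_matrix = [[2, 0], [0, 2]]  # one eigenvalue, two-dimensional eigenspace
jordan_block = [[1, 1], [0, 1]]   # one eigenvalue, one-dimensional eigenspace

print(is_diagonalizable(scalar_matrix, [2]))  # True
print(is_diagonalizable(jordan_block, [1]))   # False
```

The second matrix shows that a repeated eigenvalue can leave the eigenspaces too small to span the whole space.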
Another characterization: A matrix or linear map is diagonalizable over the field F if and only if its minimal polynomial is a product of distinct linear factors over F.
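The minimal-polynomial criterion can be tested in the same spirit: if the product of the factors (A − λI), taken once for each distinct eigenvalue λ, is the zero matrix, then the minimal polynomial divides a product of distinct linear factors. The sketch below assumes integer entries and a caller-supplied list of all distinct eigenvalues; the function names are illustrative only.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def killed_by_distinct_linear_factors(A, eigenvalues):
    """Check whether the product of (A - lam*I) over distinct lam is zero.

    Assumption: `eigenvalues` contains every distinct eigenvalue of A.
    """
    n = len(A)
    product = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    for lam in set(eigenvalues):
        factor = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
        product = mat_mul(product, factor)
    return all(entry == 0 for row in product for entry in row)


# Two distinct eigenvalues 2 and 3: (A - 2I)(A - 3I) = 0.
print(killed_by_distinct_linear_factors([[2, 1], [0, 3]], [2, 3]))  # True
# Jordan block: A - I is nonzero, so the minimal polynomial is (x - 1)^2.
print(killed_by_distinct_linear_factors([[1, 1], [0, 1]], [1]))     # False
```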
The following sufficient (but not necessary) condition is often useful.
- An n-by-n matrix A is diagonalizable over the field F if it has n distinct eigenvalues in F, i.e. if its characteristic polynomial has n distinct roots in F.
- A linear map T : V → V over the field F with n = dim(V) is diagonalizable if it has n distinct eigenvalues in F, i.e. if its characteristic polynomial has n distinct roots in F.
As a rule of thumb, over C almost every matrix is diagonalizable. More precisely: the set of complex n-by-n matrices that are not diagonalizable over C, considered as a subset of <math>\mathbf{C}^{n \times n}</math>, is a null set with respect to the Lebesgue measure. One can also say that the diagonalizable matrices form a dense subset with respect to the Zariski topology: the complement lies inside the set where the discriminant of the characteristic polynomial vanishes, which is a hypersurface. Density in the usual (strong) topology given by a norm follows as well.
The same is not true over R. As n increases, it becomes (in some sense) less and less likely that a randomly selected real matrix is diagonalizable over R.
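A standard illustration of the difference between the two fields (an added example, not taken from the text above) is the rotation of the plane by 90 degrees:
- <math>B=\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad \det(\lambda I - B) = \lambda^2 + 1.</math>
This polynomial has no real roots, so B has no real eigenvalues and is not diagonalizable over R. Over C it has the two distinct roots i and −i, so B is diagonalizable over C by the sufficient condition stated earlier.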
Example
Consider the matrix
- <math>A=\begin{pmatrix}
1 & 2 & 0 \\
0 & 3 & 0 \\
2 & -4 & 2 \end{pmatrix}</math>
This matrix has eigenvalues
- <math>\lambda_1 = 3, \quad \lambda_2 = 2, \quad \lambda_3= 1</math>
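Writing the eigenspaces as <math>E_{\lambda_i} = \ker(\lambda_i I - A)</math>, we find that <math>\dim E_{\lambda_1} = 1</math>, <math>\dim E_{\lambda_2} = 1</math>, and <math>\dim E_{\lambda_3} = 1</math>. Since these dimensions sum to 3, the size of A, the matrix A is diagonalizable.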
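By directly calculating the kernels above, we find
- <math>E_{\lambda_1} = \operatorname{span}\{(-1, -1, 2)^T\}</math>
- <math>E_{\lambda_2} = \operatorname{span}\{(0, 0, 1)^T\}</math>
- <math>E_{\lambda_3} = \operatorname{span}\{(-1, 0, 2)^T\}</math>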
Now, <math>(-1, -1, 2)^T</math>, <math>(0, 0, 1)^T</math>, and <math>(-1, 0, 2)^T</math> are eigenvectors of A belonging to distinct eigenvalues, and hence linearly independent, so we can form the matrix P whose columns are these eigenvectors:
- <math>P=
\begin{pmatrix}
-1 & 0 & -1 \\
-1 & 0 & 0 \\
2 & 1 & 2 \end{pmatrix}</math>
P diagonalizes A, as a direct computation shows:
- <math>P^{-1}AP=
\begin{pmatrix}
-1 & 0 & -1 \\
-1 & 0 & 0 \\
2 & 1 & 2 \end{pmatrix}^{-1}
\begin{pmatrix}
1 & 2 & 0 \\
0 & 3 & 0 \\
2 & -4 & 2 \end{pmatrix}
\begin{pmatrix}
-1 & 0 & -1 \\
-1 & 0 & 0 \\
2 & 1 & 2 \end{pmatrix}</math>
- <math>=\begin{pmatrix}
0 & -1 & 0 \\
2 & 0 & 1 \\
-1 & 1 & 0 \end{pmatrix}
\begin{pmatrix}
1 & 2 & 0 \\
0 & 3 & 0 \\
2 & -4 & 2 \end{pmatrix}
\begin{pmatrix}
-1 & 0 & -1 \\
-1 & 0 & 0 \\
2 & 1 & 2 \end{pmatrix}</math>
- <math>=\begin{pmatrix}
0 & -1 & 0 \\
2 & 0 & 1 \\
-1 & 1 & 0 \end{pmatrix}
\begin{pmatrix}
-3 & 0 & -1 \\
-3 & 0 & 0 \\
6 & 2 & 2 \end{pmatrix}</math>
- <math>=\begin{pmatrix}
3 & 0 & 0 \\
0 & 2 & 0 \\
0 & 0 & 1\end{pmatrix}</math>
as required.
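The computation can be double-checked numerically. The sketch below is one possible way to do so, using exact rational arithmetic from the Python standard library; the helper names `mat_mul` and `inverse` are illustrative and not part of any library.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def inverse(M):
    """Gauss-Jordan inverse in exact arithmetic; raises ValueError if singular."""
    n = len(M)
    # Augment M with the identity matrix: [M | I].
    aug = [
        [Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(M)
    ]
    for col in range(n):
        pivot = next((i for i in range(col, n) if aug[i][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        # Clear the pivot column in every other row.
        for i in range(n):
            if i != col:
                factor = aug[i][col]
                aug[i] = [x - factor * y for x, y in zip(aug[i], aug[col])]
    return [row[n:] for row in aug]


A = [[1, 2, 0], [0, 3, 0], [2, -4, 2]]
P = [[-1, 0, -1], [-1, 0, 0], [2, 1, 2]]  # columns are the eigenvectors

P_inv = inverse(P)
D = mat_mul(mat_mul(P_inv, A), P)

print([[int(x) for x in row] for row in P_inv])  # [[0, -1, 0], [2, 0, 1], [-1, 1, 0]]
print(D == [[3, 0, 0], [0, 2, 0], [0, 0, 1]])    # True
```

Exact fractions are used here so that the comparison with the diagonal matrix is an equality test rather than a floating-point approximation.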
An application
Diagonalization can be used to compute the powers of a matrix A efficiently, provided the matrix is diagonalizable. Suppose we have found that
- <math>P^{-1}AP = D</math>
is a diagonal matrix. Then
- <math>A^k = (PDP^{-1})^k = PD^kP^{-1}</math>
and the latter is easy to calculate since it only involves the powers of a diagonal matrix.
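As a rough illustration of this identity, the sketch below computes a power of the 3-by-3 matrix A from the example section in two ways: through the diagonal form and through repeated multiplication. The function names are hypothetical, and `P_inv` is entered by hand as the inverse of P (the code asserts that their product is the identity).

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def power_by_diagonalization(P, eigenvalues, P_inv, k):
    """Compute P D^k P^{-1}, where D = diag(eigenvalues)."""
    n = len(eigenvalues)
    # Powers of a diagonal matrix: raise each diagonal entry to the k-th power.
    D_k = [[eigenvalues[i] ** k if i == j else 0 for j in range(n)] for i in range(n)]
    return mat_mul(mat_mul(P, D_k), P_inv)


def power_by_repeated_multiplication(A, k):
    """Compute A^k with k successive matrix products."""
    result = identity(len(A))
    for _ in range(k):
        result = mat_mul(result, A)
    return result


A = [[1, 2, 0], [0, 3, 0], [2, -4, 2]]
P = [[-1, 0, -1], [-1, 0, 0], [2, 1, 2]]
P_inv = [[0, -1, 0], [2, 0, 1], [-1, 1, 0]]
assert mat_mul(P_inv, P) == identity(3)  # sanity check on the hand-entered inverse

k = 5
via_diagonal = power_by_diagonalization(P, [3, 2, 1], P_inv, k)
via_products = power_by_repeated_multiplication(A, k)
print(via_diagonal == via_products)  # True
```

The diagonal route needs only two matrix products regardless of k, whereas the direct route needs k of them.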
For example, consider the following matrix:
- <math>M =\begin{bmatrix}a & b-a \\ 0 &b \end{bmatrix}.</math>
Calculating the various powers of M reveals a surprising pattern:
- <math>
M^2 = \begin{bmatrix}a^2 & b^2-a^2 \\ 0 &b^2 \end{bmatrix},\quad
M^3 = \begin{bmatrix}a^3 & b^3-a^3 \\ 0 &b^3 \end{bmatrix},\quad
M^4 = \begin{bmatrix}a^4 & b^4-a^4 \\ 0 &b^4 \end{bmatrix},\quad \ldots
</math>
The above phenomenon can be explained by diagonalizing M. To accomplish this, we need a basis of <math>\mathbf{R}^2</math> consisting of eigenvectors of M. One such eigenvector basis is given by
- <math>\mathbf{u}=\begin{bmatrix} 1 \\ 0 \end{bmatrix}=\mathbf{e}_1,\quad
\mathbf{v}=\begin{bmatrix} 1 \\ 1 \end{bmatrix}=\mathbf{e}_1+\mathbf{e}_2,</math>
where <math>\mathbf{e}_i</math> denotes the i-th standard basis vector of <math>\mathbf{R}^2</math>.
The reverse change of basis is given by
- <math> \mathbf{e}_1 = \mathbf{u},\qquad \mathbf{e}_2 = \mathbf{v}-\mathbf{u}.</math>
Straightforward calculations show that
- <math>M\mathbf{u} = a\mathbf{u},\qquad M\mathbf{v}=b\mathbf{v}.</math>
Thus, a and b are the eigenvalues corresponding to u and v, respectively.
Applying M repeatedly and using the linearity of matrix multiplication, we have that
- <math> M^n \mathbf{u} = a^n\, \mathbf{u},\qquad M^n \mathbf{v}=b^n\,\mathbf{v}.</math>
Switching back to the standard basis, we have
- <math> M^n \mathbf{e}_1 = M^n \mathbf{u} = a^n \mathbf{e}_1,</math>
- <math> M^n \mathbf{e}_2 = M^n (\mathbf{v}-\mathbf{u}) = b^n \mathbf{v} - a^n\mathbf{u} = (b^n-a^n) \mathbf{e}_1+b^n\mathbf{e}_2.</math>
The preceding relations, expressed in matrix form, are
- <math>
M^n = \begin{bmatrix}a^n & b^n-a^n \\ 0 &b^n \end{bmatrix},
</math>
thereby explaining the above phenomenon.
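The closed form can also be spot-checked for concrete numbers. The sketch below picks the arbitrary sample values a = 2 and b = 5 (an assumption made purely for illustration) and compares repeated multiplication with both the closed form and the product built from the eigenvector basis u, v.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def closed_form(a, b, n):
    """The claimed formula for M^n."""
    return [[a ** n, b ** n - a ** n], [0, b ** n]]


a, b = 2, 5                  # arbitrary sample values
M = [[a, b - a], [0, b]]
P = [[1, 1], [0, 1]]         # columns are the eigenvectors u and v
P_inv = [[1, -1], [0, 1]]    # encodes e1 = u, e2 = v - u

power = [[1, 0], [0, 1]]     # M^0
all_match = True
for n in range(1, 7):
    power = mat_mul(power, M)  # now equals M^n
    via_diagonal = mat_mul(mat_mul(P, [[a ** n, 0], [0, b ** n]]), P_inv)
    all_match = all_match and power == closed_form(a, b, n) == via_diagonal

print(all_match)  # True
```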
|