|
In numerical linear algebra, the Arnoldi iteration is an eigenvalue algorithm and an important example of iterative methods. Arnoldi finds the eigenvalues of general (possibly non-Hermitian) matrices; an analogous method for Hermitian matrices is the Lanczos iteration. The Arnoldi iteration was invented by W. E. Arnoldi in 1951.
The term iterative method, used to describe Arnoldi, can perhaps be somewhat confusing. Note that all general eigenvalue algorithms must be iterative. This is not what is referred to when we say Arnoldi is an iterative method. Rather, Arnoldi belongs to a class of linear algebra algorithms (based on the idea of Krylov subspaces) that give a partial result after a relatively small number of iterations. This is in contrast to so-called direct methods, which must complete to give any useful results.
Krylov subspaces and the power iteration
Let <math>A<math> be a <math>m \times m<math> matrix, with <math>m<math> eigenvalues (counted with multiplicity).
An intuitive, albeit impractical method for finding an eigenvalue (specifically the largest eigenvalue) of <math>A<math> is the power iteration. Starting with a initial random vector b, calculate <math>Ab, A^{2}b, A^{3}b, \ldots<math>, until the normalized result converges. Say this takes <math>n-1<math> steps. Then our result, <math>A^{n-1}b<math> is a good approximation of the eigenvector corresponding to the largest eigenvalue, <math>\lambda_{1}<math>. Why? Let <math>\lambda_{1}, \lambda_{2}, \ldots, \lambda{m}<math> be the m eigenvalues (counted with multiplicity) of A in nondecreasing order, and let <math>v_{1}, v_{2}, \ldots, v_{m}<math> be the corresponding eigenvectors. The initial vector b can be written:
- <math>b = c_{1}v_{1} + c_{2}v_{2} + \ldots + c_{m}v_{m}<math>
With probability 1, <math>c_{1} \neq 0<math>. Now,
- <math>\begin{matrix}A^{k}b & = & c_{1}A^{k}v_{1} + c_{2}A^{k}v_{2} + \ldots + c_{m}A^{k}v_{m} \\
& = & c_{1}\lambda_{1}^{k}v_{1} + c_{2}\lambda_{2}^{k}v_{2} + \ldots + c_{m}\lambda_{m}^{k}v_{m} \\
& = & c_{1}\lambda_{1}^{k} \left( v_{1} + \frac{c_{2}}{c_{1}}\left(\frac{\lambda_{2}}{\lambda_{1}}\right)^{k}v_{2} + \ldots + \frac{c_{m}}{c_{1}}\left(\frac{\lambda_{m}}{\lambda_{1}}\right)^{k}v_{m}\right) \end{matrix}<math>
For simplicity, assume <math>\lambda_{1} > \lambda_{2} \geq \ldots \geq \lambda_{m}<math> (the degenerate case <math>\lambda_{1} = \lambda_{2}<math> offers no difficulty in principle, but we will not consider it here). Now, as <math>k \rightarrow \infty<math>, clearly the normalized result converges to <math>v_{1}<math>, but in floating point, this occurs after a finite number of iterations, <math>n-1<math>.
Next, note that by using only the final result <math>A^{n-1}b<math>, we are throwing away a lot of useful computation. What if instead, we form the <math>m \times n<math> so-called Krylov matrix:
- <math>K_{m} = \begin{bmatrix}b & Ab & A^{2}b & \cdots & A^{n-1}b \end{bmatrix}<math>
The columns of this matrix are not orthogonal, but in principle, we can extract an orthogonal basis, via a method such as Gram-Schmidt orthogonalization. The resulting vectors are a basis of the Krylov subspace, <math>\mathcal{K}_{n}<math>. We expect the vectors of this basis to give good approximations of the eigenvectors corresponding to the <math>n<math> largest eigenvalues, for the same reason that <math>A^{n-1}b<math> approximates <math>v_{1}<math>.
The Arnoldi iteration
The process described above is intuitive. Unfortunately, it is also unstable. This is where the Arnoldi iteration enters.
|