Much of this material is taken from Anne Greenbaum's book, Iterative Methods for Solving Linear Systems.
Given a vector x and a matrix A, the Krylov space of A generated by x is the
subspace spanned by
$\{x, Ax, A^2 x, \ldots\}$. Here we outline some
theory associated with Krylov space methods, which are iterative methods for
the solution of the linear problem Ax=b.
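As a small illustration of the definition, the sketch below builds the first few Krylov vectors x, Ax, A^2 x as columns of a matrix. The helper name `krylov_matrix` and the 3-by-3 test matrix are assumptions made for this example only.

```python
import numpy as np

def krylov_matrix(A, x, m):
    """Return the matrix whose columns are x, Ax, ..., A^(m-1) x."""
    cols = [np.asarray(x, dtype=float)]
    for _ in range(m - 1):
        # Each new column is A times the previous one.
        cols.append(A @ cols[-1])
    return np.column_stack(cols)

# Hypothetical example data.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
x = np.array([1.0, 0.0, 0.0])
K = krylov_matrix(A, x, 3)
```

In floating point these columns quickly become nearly parallel for larger m, which is one reason practical methods work with an orthogonalized basis instead.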
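If M is a preconditioner, so
$M^{-1}A \approx I$,
a simple iteration can be constructed by
$$x_{k+1} = x_k + M^{-1}(b - A x_k).$$
That is, a procedure might read:
Given an initial guess $x_0$, compute $r_0 = b - A x_0$ and solve $M z_0 = r_0$.
For $k = 0, 1, 2, \ldots$:
    set $x_{k+1} = x_k + z_k$,
    compute $r_{k+1} = b - A x_{k+1}$,
    solve $M z_{k+1} = r_{k+1}$.
Let the error be
$e_k = A^{-1}b - x_k$. Then
$$e_{k+1} = (I - M^{-1}A)\,e_k.$$
Using the 2-norm,
$$\|e_k\| \le \|I - M^{-1}A\|^k \,\|e_0\|,$$
one can establish conditions for convergence: the iteration converges for every initial guess if $\|I - M^{-1}A\| < 1$.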
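The simple iteration can be sketched as follows, assuming the update $x_{k+1} = x_k + M^{-1}(b - A x_k)$. The function name `simple_iteration`, the callable `M_solve` (which applies $M^{-1}$ to a vector), the stopping rule, and the choice of M as the diagonal of A are all assumptions for illustration.

```python
import numpy as np

def simple_iteration(A, b, M_solve, x0, tol=1e-10, max_iter=500):
    """Iterate x <- x + M^{-1}(b - A x) until the relative residual is small."""
    x = np.asarray(x0, dtype=float).copy()
    for k in range(max_iter):
        r = b - A @ x                      # current residual
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, k
        x = x + M_solve(r)                 # solve M z = r and correct x
    return x, max_iter

# Hypothetical example: diagonally dominant A, M = diag(A).
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
d = np.diag(A)
x_si, iters_si = simple_iteration(A, b, lambda r: r / d, np.zeros(3))

# Contraction factor ||I - M^{-1} A|| in the 2-norm for this choice of M.
contraction = np.linalg.norm(np.eye(3) - A / d[:, None], 2)
```

For this matrix the contraction factor is below one, so the error bound above guarantees convergence.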
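We can improve upon the iteration formula by writing
$$x_{k+1} = x_k + a_k (b - A x_k),$$
where the scalar
$a_k$
is chosen to satisfy certain criteria.
For instance, if we try to minimize the 2-norm of the residual
$r_{k+1} = b - A x_{k+1}$, then
$$a_k = \frac{\langle r_k, A r_k\rangle}{\langle A r_k, A r_k\rangle}.$$
If the matrix A is symmetric positive definite, we can instead minimize the A-norm of the error, which gives
$$a_k = \frac{\langle r_k, r_k\rangle}{\langle r_k, A r_k\rangle},$$
and this yields the method of steepest descent.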
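Both choices of the scalar can be tried with one routine. The sketch below assumes the update $x_{k+1} = x_k + a_k r_k$ with $r_k = b - A x_k$, and the step lengths $a_k = \langle r_k, A r_k\rangle / \langle A r_k, A r_k\rangle$ (residual minimization, sometimes called Orthomin(1)) or $a_k = \langle r_k, r_k\rangle / \langle r_k, A r_k\rangle$ (steepest descent). The function name, the `rule` argument, and the test problem are hypothetical.

```python
import numpy as np

def one_parameter_iteration(A, b, x0, rule, tol=1e-10, max_iter=1000):
    """x <- x + a*r with a chosen by 'min_residual' or 'steepest_descent'."""
    x = np.asarray(x0, dtype=float).copy()
    r = b - A @ x
    for k in range(max_iter):
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, k
        Ar = A @ r
        if rule == "min_residual":
            a = (r @ Ar) / (Ar @ Ar)       # minimizes ||r_{k+1}||_2
        elif rule == "steepest_descent":
            a = (r @ r) / (r @ Ar)         # minimizes the A-norm of the error (A SPD)
        else:
            raise ValueError("unknown rule: " + rule)
        x = x + a * r
        r = r - a * Ar                     # updated residual b - A x
    return x, max_iter

# Hypothetical symmetric positive definite example.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
x_mr, iters_mr = one_parameter_iteration(A, b, np.zeros(3), "min_residual")
x_sd, iters_sd = one_parameter_iteration(A, b, np.zeros(3), "steepest_descent")
```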
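More generally, one can iterate by
$$x_{k+1} = x_k + a_k p_k,$$
where $p_k$ is a search direction. From above, we find that the residual
$r_{k+1} = b - A x_{k+1}$ is orthogonal to
$A p_k$ in the first case, and the error
$e_{k+1} = A^{-1}b - x_{k+1}$ is A-orthogonal to
$p_k$ in steepest descent.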
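Suppose, in the first case,
that we build the search direction $p_k$ not only from the residual
$r_k = b - A x_k$, but also so that $A p_k$ is orthogonal to
$A p_{k-1}$. Then the residual
$r_{k+1}$ is minimized in the plane $r_k + \mathrm{span}\{A r_k, A p_{k-1}\}$. That is,
$$x_{k+1} = x_k + a_k p_k, \qquad p_k = r_k - b_{k-1} p_{k-1}, \qquad b_{k-1} = \frac{\langle A r_k, A p_{k-1}\rangle}{\langle A p_{k-1}, A p_{k-1}\rangle},$$
where the coefficients are chosen to enforce the orthogonality.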
If A is symmetric positive definite, then one can show that the residual
$r_{k+1}$ is minimized over $r_0$ plus the whole subspace spanned by
$\{A p_0, A p_1, \ldots, A p_k\}$,
which is the same subspace spanned by
$\{A r_0, A^2 r_0, \ldots, A^{k+1} r_0\}$.
This is the MINRES algorithm, although
MINRES can be extended to indefinite systems if implemented correctly.
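If A is symmetric positive definite and we modify steepest descent so that each step eliminates the A-projection of the error onto a direction that is A-orthogonal to the previous search direction, i.e. onto the direction
$$p_k = r_k - b_{k-1} p_{k-1}, \qquad b_{k-1} = \frac{\langle r_k, A p_{k-1}\rangle}{\langle p_{k-1}, A p_{k-1}\rangle},$$
where $r_k = b - A x_k$, then we get Conjugate Gradient.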
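A minimal sketch of Conjugate Gradient along these lines, assuming the direction update $p_k = r_k - b_{k-1} p_{k-1}$ with $b_{k-1} = \langle r_k, A p_{k-1}\rangle / \langle p_{k-1}, A p_{k-1}\rangle$ and the step length $a_k = \langle r_k, p_k\rangle / \langle p_k, A p_k\rangle$. The function name, stopping rule, and test problem are assumptions for illustration, and no preconditioner is used.

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-10, max_iter=None):
    """Unpreconditioned CG for symmetric positive definite A."""
    x = np.asarray(x0, dtype=float).copy()
    r = b - A @ x
    p = r.copy()                           # first direction is the residual
    if max_iter is None:
        max_iter = 10 * len(b)
    for k in range(max_iter):
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, k
        Ap = A @ p
        pAp = p @ Ap
        a = (r @ p) / pAp                  # removes the A-projection of the error on p
        x = x + a * p
        r = r - a * Ap
        beta = (r @ Ap) / pAp              # makes the new direction A-orthogonal to p
        p = r - beta * p
    return x, max_iter

# Hypothetical symmetric positive definite example.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
x_cg, iters_cg = conjugate_gradient(A, b, np.zeros(3))
```

In exact arithmetic CG terminates in at most n steps for an n-by-n matrix, which the 3-by-3 example reflects.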
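For a general A, the idea of minimizing over a subspace can be extended,
but at the price of orthogonalizing the vectors at each stage. That is,
imagine minimizing the 2-norm of
$r_{k+1}$ over $r_k$ plus the j-dimensional
space spanned by
$\{A p_k, A p_{k-1}, \ldots, A p_{k-j+1}\}$.
Then
$$p_k = r_k - \sum_{l=1}^{j-1} b_{k-l}^{(k)} p_{k-l}$$
with
$$b_{k-l}^{(k)} = \frac{\langle A r_k, A p_{k-l}\rangle}{\langle A p_{k-l}, A p_{k-l}\rangle}.$$
This implementation can fail (for example, the step length vanishes when $\langle r_k, A r_k\rangle = 0$). A better algorithm is to be had if
we replace
$r_k$
by
$A p_{k-1}$ in the formula for $p_k$. Unfortunately, this
implementation suffers from erratic convergence rates, unless A is
symmetric.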
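A very different approach can be found if we begin with the Arnoldi Algorithm:
Given $q_1$ with $\|q_1\| = 1$, for $j = 1, 2, \ldots$:
$$\tilde q_{j+1} = A q_j - \sum_{i=1}^{j} h_{ij} q_i, \qquad h_{ij} = \langle A q_j, q_i\rangle,$$
$$h_{j+1,j} = \|\tilde q_{j+1}\|, \qquad q_{j+1} = \tilde q_{j+1} / h_{j+1,j}.$$
Then we can derive the GMRES method as follows:
Given
$x_0$ and $q_1 = r_0 / \|r_0\|$ with $r_0 = b - A x_0$, set
$$x_k = x_0 + Q_k y_k.$$
Here,
$Q_k$
is a matrix that has the q's, $q_1, \ldots, q_k$, as column vectors,
and
$y_k$
is the solution to the least squares problem $\min_y \|\beta \xi_1 - H_{k+1,k}\, y\|$, where $\beta = \|r_0\|$, $\xi_1$ is the first unit vector, and $H_{k+1,k}$ is the $(k+1)$-by-$k$ matrix of Arnoldi coefficients $h_{ij}$ (solved by
a QR factorization).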
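The Arnoldi recursion and the resulting GMRES iterate can be sketched as below. This is an unrestarted, unpreconditioned illustration under stated assumptions: the function names and test matrix are hypothetical, the Arnoldi step uses modified Gram-Schmidt, and the small least squares problem is solved with `np.linalg.lstsq` rather than the incrementally updated QR factorization a production code would use.

```python
import numpy as np

def arnoldi_step(A, Q, H, j):
    """Compute column j of H and basis vector j+1 of Q (0-based indices)."""
    w = A @ Q[:, j]
    for i in range(j + 1):                 # orthogonalize against q_0..q_j
        H[i, j] = Q[:, i] @ w
        w = w - H[i, j] * Q[:, i]
    H[j + 1, j] = np.linalg.norm(w)
    if H[j + 1, j] > 0.0:                  # zero means the Krylov space is exhausted
        Q[:, j + 1] = w / H[j + 1, j]

def gmres_sketch(A, b, x0, tol=1e-10, max_iter=None):
    """Minimize ||b - A x||_2 over x0 + span{q_1, ..., q_k} for k = 1, 2, ..."""
    n = len(b)
    m = n if max_iter is None else max_iter
    x = np.asarray(x0, dtype=float).copy()
    r0 = b - A @ x
    beta = np.linalg.norm(r0)
    if beta == 0.0:
        return x, 0
    Q = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    Q[:, 0] = r0 / beta
    for k in range(1, m + 1):
        arnoldi_step(A, Q, H, k - 1)
        rhs = np.zeros(k + 1)
        rhs[0] = beta                      # beta times the first unit vector
        y, *_ = np.linalg.lstsq(H[:k + 1, :k], rhs, rcond=None)
        x = x0 + Q[:, :k] @ y
        if np.linalg.norm(b - A @ x) <= tol * np.linalg.norm(b):
            return x, k
    return x, m

# Hypothetical nonsymmetric example.
A = np.array([[3.0, 1.0, 0.0],
              [-1.0, 3.0, 1.0],
              [0.0, -1.0, 3.0]])
b = np.array([1.0, 2.0, 3.0])
x_gm, iters_gm = gmres_sketch(A, b, np.zeros(3))
```

The arrays Q and H grow with the iteration count, which is the storage cost that motivates restarting.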
Because of the need to store all the search directions, GMRES can be expensive. In practice, one usually restarts the GMRES procedure after a fixed number of steps. See Templates for more info.
If A is symmetric, positive definite, the Gram-Schmidt procedure for constructing an orthonormal basis of the Krylov space of A reduces to a 3-term recursion. This is not true if A is not symmetric. However, one can use a pair of 3-term recursions, one for A and one for its transpose, to obtain a bi-orthogonal basis. This produces the BiCG algorithm. BiCG suffers from erratic convergence; van der Vorst developed BiCGSTAB to overcome this difficulty.
Nick Trefethen and colleagues showed that, among these various Krylov space methods, one can construct an example where any particular one of them could have the best convergence behavior and any one the worst.
MORAL: Iterative methods are a tough game, and which method is "best" usually depends on the problem at hand.