Because there are many very large problems that people want answers to! From the design of modern aircraft to creating Toy Story, from chemical structure determination to searching through terabyte databases, the demand for ever more computing horsepower has led to an evolution of "the currently best computer architectures". To give you an idea of how large problems can get, consider the design of modern airplanes, engineered today mostly by simulation and with only (at most) a little wind tunnel measurements. The governing physics is expressed by the Navier-Stokes equations. These are a set of three time-dependent partial differential equations for the fluid (air) momentum, and a fourth PDE expressing the conservation of mass. A very typical grid around the exterior of an aircraft may have several hundred grid points or elements in each space direction. Lets take a conservative estimate - say 400 points. That means there are 6.4 X10 ^7 grid points. At each grid point there are 4 variables, three velocities and either a density or a pressure, for a total of 2.56 X10^8 unknowns. To reach a steady-state, mimicing a plane flying at constant velocity in a fixed wind requires anywhere from 100 to 10,000 timesteps, for a total of around 10^11 variable updates. Knowing what is involved in solving the Navier-Stokes equations, you can estimate that there are between 10^12 and 10^13 calculations that must be performed - and about an equal number of data loads and stores. Even with close-to-gigahertz processors and broad bandwidth buses, that is a lot of required horsepower.
We are going to illustrate many of the ideas in this course by calling on basic
matrix-by-vector multiplication, the real workhorse of scientific computing. We
begin by illustrating this multiplication in an iterative method to solve a basic
equation in linear algebra. Consider the vector
, and
the NXN matrix
. Given a fixed right-hand-side vector
, solve the equation
One approach to doing this is the so-called Jacobi iteration method. Begin with an
initial guess at the solution, lets call it
. Break A up into three pieces,
A=D+L+U, where D is the diagonal, and U and L are, respectively, the upper- and
lower-triangular parts of A. Successive iterations
, etc are computed
by the iteration
Think about the calculations involved in obtaining the right-hand-side. There is
multiplication by a matrix (at least almost all of a matrix), addition of a vector,
and division. If A is "full", then the L+U multiplication requires
operations,
and experience shows this method - even on modern architectures - is not
cost-effective. However if A is sparse, then the matrix multiplication by L+U
involves only O(N) multiplies, and on distributed memory machines, Jacobi iteration
can be a useful tool. We will flesh out these ideas during the course of the semester.
For now, as we begin by talking about computer architectures, i want you to start coding this simple solver. It will be used, modified, generalized several times. So write modularly and well.
HOMEWORK: Code the Jacobi method to solve the linear system Au=b, where
A is a matrix with -2 down the diagonal, 1 on the sub- and super-diagonals, and
b=0. Fix the conditions
(start with 0 and 1 resp). You can
solve the equation exactly, to test your code.