Why Supercomputers?

Because there are many very large problems that people want answers to! From the design of modern aircraft to creating Toy Story, from chemical structure determination to searching through terabyte databases, the demand for ever more computing horsepower has driven a continual evolution of the "currently best" computer architectures.

To give you an idea of how large problems can get, consider the design of modern airplanes, engineered today mostly by simulation, with at most a few wind tunnel measurements. The governing physics is expressed by the Navier-Stokes equations: a set of three time-dependent partial differential equations (PDEs) for the fluid (air) momentum, and a fourth PDE expressing the conservation of mass. A typical grid around the exterior of an aircraft may have several hundred grid points or elements in each space direction. Let's take a conservative estimate - say 400 points. That means there are 400^3 = 6.4 x 10^7 grid points. At each grid point there are 4 variables, three velocities and either a density or a pressure, for a total of 2.56 x 10^8 unknowns. Reaching a steady state, mimicking a plane flying at constant velocity in a fixed wind, requires anywhere from 100 to 10,000 timesteps, for a total of around 10^11 variable updates (about 2.6 x 10^10 at the low end and 2.6 x 10^12 at the high end). Knowing what is involved in solving the Navier-Stokes equations, you can estimate that there are between 10^12 and 10^13 calculations that must be performed - and about an equal number of data loads and stores. Even with close-to-gigahertz processors and high-bandwidth buses, that is a lot of required horsepower.
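The arithmetic in the estimate above is easy to check for yourself. Here is a minimal Python sketch; the function names problem_size and variable_updates are made up for this illustration, and the default values are simply the numbers quoted in the text.

```python
def problem_size(points_per_direction=400, variables_per_point=4):
    """Return (grid points, unknowns) for a cubic grid in three space directions."""
    grid_points = points_per_direction ** 3
    unknowns = grid_points * variables_per_point
    return grid_points, unknowns


def variable_updates(unknowns, timesteps):
    """Every unknown is updated once per timestep."""
    return unknowns * timesteps


grid_points, unknowns = problem_size()
print(f"grid points: {grid_points:.2e}")   # 6.40e+07
print(f"unknowns:    {unknowns:.2e}")      # 2.56e+08

# The text quotes anywhere from 100 to 10,000 timesteps to reach steady state.
for steps in (100, 1_000, 10_000):
    updates = variable_updates(unknowns, steps)
    print(f"{steps:>6} timesteps -> {updates:.2e} variable updates")
```

The middle case, 1,000 timesteps, gives 2.56 x 10^11 updates, which is where the "around 10^11" figure comes from.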

How do scientists gather together the resources to perform these large calculations? There are two basic approaches to the design of parallel computers. The first is a so-called shared memory machine, in which every processor has access to all data. This kind of architecture, exemplified by SGI's Origin 2000 machine, is relatively easy for us as users to program, but performance doesn't scale all that well as processors are added. The second alternative is a distributed memory machine: a cluster of computers hooked together by cables -- maybe ethernet, maybe fast optical networking, maybe something else. In this case, each processor knows about only a limited set of data. For a processor to use data that is not local to it, messages must be sent from the location holding the required data. This kind of programming is more difficult for us as users - we get into the business of keeping track of what information is where, and shipping it around.
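To make the difference in programming style concrete, here is a toy Python sketch that sums a list of numbers both ways. It is only an imitation: Python threads all live in one real memory, so the "distributed" version simply pretends, by giving each worker a private copy of its piece and letting results travel only through a queue that stands in for the network. The function names are invented for this illustration; this is not MPI or OpenMP code.

```python
import queue
import threading


def shared_memory_sum(values, num_workers):
    """Shared memory style: every worker reads the one shared list directly."""
    partial = [0] * num_workers          # result slots visible to every worker

    def work(rank):
        # Direct read from the shared list: no copies, no messages.
        partial[rank] = sum(values[rank::num_workers])

    threads = [threading.Thread(target=work, args=(rank,))
               for rank in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partial)


def distributed_memory_sum(values, num_workers):
    """Distributed memory style: each worker owns only its local piece,
    and results reach the collector as explicit messages."""
    inbox = queue.Queue()                # stands in for the network

    def work(rank, local_values):
        # This worker sees only local_values; its result leaves as a message.
        inbox.put((rank, sum(local_values)))

    threads = []
    for rank in range(num_workers):
        local_copy = list(values[rank::num_workers])   # private piece of the data
        threads.append(threading.Thread(target=work, args=(rank, local_copy)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # The collector receives exactly one message per worker and combines them.
    total = 0
    for _ in range(num_workers):
        _rank, local_sum = inbox.get()
        total += local_sum
    return total


data = list(range(1, 101))               # 1, 2, ..., 100
print(shared_memory_sum(data, 4))        # 5050
print(distributed_memory_sum(data, 4))   # 5050
```

Notice the extra bookkeeping in the second version: somebody has to decide which worker owns which data, and every result has to be sent and received explicitly.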

Lots of other details influence the performance of either a shared or a distributed memory computer - how much cache is available, how fast the buses are, how many operations each processor can perform in each clock cycle. We will look at these issues over the next months.
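As a rough illustration of how these details enter, here is a back-of-the-envelope Python sketch for a single processor. The operation count is the low-end estimate from the airplane example; everything else (one operation per clock cycle, 8 bytes per load or store, a bus moving 10^9 bytes per second) is a hypothetical number chosen only to show the arithmetic, not a measurement of any real machine. The model is deliberately crude: it ignores cache entirely and assumes the run can go no faster than the slower of computing and moving data.

```python
def compute_time_s(operations, clock_hz, ops_per_cycle):
    """Time spent on arithmetic if the processor never waits for data."""
    return operations / (clock_hz * ops_per_cycle)


def memory_time_s(accesses, bytes_per_access, bandwidth_bytes_per_s):
    """Time spent moving data if every load and store goes over the bus."""
    return accesses * bytes_per_access / bandwidth_bytes_per_s


def lower_bound_s(operations, accesses, clock_hz, ops_per_cycle,
                  bytes_per_access, bandwidth_bytes_per_s):
    """The run cannot finish faster than the slower of the two."""
    return max(compute_time_s(operations, clock_hz, ops_per_cycle),
               memory_time_s(accesses, bytes_per_access, bandwidth_bytes_per_s))


# From the text: about 10^12 calculations and about as many loads and stores.
OPERATIONS = 1e12
ACCESSES = 1e12

# Hypothetical machine parameters (assumptions for illustration only).
CLOCK_HZ = 1e9            # "close-to-gigahertz"
OPS_PER_CYCLE = 1         # assumed
BYTES_PER_ACCESS = 8      # assumed double precision values
BANDWIDTH = 1e9           # assumed bytes per second

print(compute_time_s(OPERATIONS, CLOCK_HZ, OPS_PER_CYCLE))         # 1000.0
print(memory_time_s(ACCESSES, BYTES_PER_ACCESS, BANDWIDTH))        # 8000.0
print(lower_bound_s(OPERATIONS, ACCESSES, CLOCK_HZ, OPS_PER_CYCLE,
                    BYTES_PER_ACCESS, BANDWIDTH))                  # 8000.0
```

With these made-up numbers the data movement, not the arithmetic, sets the pace, which is one reason cache size and bus speed matter as much as clock rate.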

Over the next couple of weeks, Dr. Tom Furlani will talk about "MPI", which is a library of calls used in programming distributed memory machines, and Dr. Matt Jones and I will talk about OpenMP, which is a standard for programming shared memory machines. After that we will talk more about architectures, and we will start writing codes using MPI and OpenMP.


E. Bruce Pitman
Mon Nov 5 15:15:58 EST 2001