
BLAS

The BLAS - Basic Linear Algebra Subprograms - are a library of subroutines designed to provide efficient implementations of commonly used linear algebra operations, like dot products, matrix-vector multiplies, and matrix-matrix multiplies. The naming convention is not unlike other libraries - the first letter indicates precision (S for single, D for double), the rest gives a hint (maybe) of what the routine does, e.g. SAXPY, DGEMM.
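As a small illustration of the naming convention, the sketch below splits a routine name into its precision prefix and operation stem. The helper split_blas_name is hypothetical (it is not part of any BLAS library), and it assumes the simple one-letter-prefix pattern, so it ignores exceptions such as ISAMAX.

```python
# Hypothetical helper for illustration only; not part of any BLAS library.
# Assumes the name starts with a single precision letter (S, D, C or Z).
PRECISION_PREFIXES = {
    "S": "single precision real",
    "D": "double precision real",
    "C": "single precision complex",
    "Z": "double precision complex",
}


def split_blas_name(name):
    """Return (precision description, operation stem) for a BLAS routine name."""
    name = name.upper()
    prefix, operation = name[0], name[1:]
    if prefix not in PRECISION_PREFIXES:
        raise ValueError("unrecognized precision prefix: " + prefix)
    return PRECISION_PREFIXES[prefix], operation


for routine in ("SAXPY", "DGEMM"):
    precision, operation = split_blas_name(routine)
    print(routine, "->", precision + ",", "operation", operation)
```

The stem still has to be looked up: AXPY is "a times x plus y", and GEMM is a general matrix-matrix multiply.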

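The BLAS are divided into 3 levels: vector-vector, matrix-vector, and matrix-matrix. The biggest speed-up can be in level 3, because a matrix-matrix multiply does on the order of N^3 floating-point operations on only order N^2 data, so each value loaded from memory can be reused many times.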

Examples:

Level 1

\begin{eqnarray*}
y & \leftarrow & \alpha x + y \qquad \mbox{(SAXPY)}
\end{eqnarray*}

Level 2

\begin{eqnarray*}
y & \leftarrow & \alpha A x + \beta y \qquad \mbox{(SGEMV)}
\end{eqnarray*}

Level 3

\begin{eqnarray*}
C & \leftarrow & \alpha A B + \beta C \qquad \mbox{(SGEMM)}
\end{eqnarray*}
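To make the three levels concrete, here is a minimal sketch of one operation from each level, written as plain unoptimized Python loops. It assumes the standard definitions of AXPY, GEMV and GEMM (no transpose options), and the function names axpy, gemv and gemm are chosen only for this illustration. Real BLAS routines overwrite their output argument in place and are heavily tuned; these versions return new lists and only show the arithmetic.

```python
# Illustrative reference loops only; a real BLAS is tuned for the hardware
# and updates y or C in place instead of returning a new list.

def axpy(alpha, x, y):
    """Level 1 (vector-vector): return alpha*x + y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def gemv(alpha, A, x, beta, y):
    """Level 2 (matrix-vector): return alpha*A*x + beta*y.

    A is an M-by-N matrix stored as a list of M rows.
    """
    return [alpha * sum(aij * xj for aij, xj in zip(row, x)) + beta * yi
            for row, yi in zip(A, y)]


def gemm(alpha, A, B, beta, C):
    """Level 3 (matrix-matrix): return alpha*A*B + beta*C.

    A is M-by-K, B is K-by-N, C is M-by-N, all stored as lists of rows.
    """
    M, K, N = len(A), len(B), len(B[0])
    result = []
    for i in range(M):
        row = []
        for j in range(N):
            acc = 0.0
            for k in range(K):
                acc += A[i][k] * B[k][j]
            row.append(alpha * acc + beta * C[i][j])
        result.append(row)
    return result


if __name__ == "__main__":
    print(axpy(2.0, [1.0, 2.0], [10.0, 20.0]))
    print(gemv(1.0, [[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0], 0.0, [5.0, 5.0]))
    print(gemm(1.0, [[1.0, 2.0], [3.0, 4.0]], [[0.0, 1.0], [1.0, 0.0]],
               1.0, [[1.0, 1.0], [1.0, 1.0]]))
```

Note the loop depth: one loop for level 1, two for level 2, three for level 3.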

Roughly, Level 1 can give about 20 Mflops, Level 2 about 30 Mflops, and Level 3 about 60 Mflops on 1997-98 generation chips, IF THE PROBLEM SIZE IS BIG ENOUGH.

How efficient is the BLAS?

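routine               loads/stores     float ops            refs/ops

level 1
SAXPY                   3N               2N                  3/2

level 2
SGEMV                 MN+N+2M            2MN                 1/2

level 3
SGEMM                 2MN+MK+KN          2MNK                2/N

Here N is the vector length for SAXPY, the matrix in SGEMV is M by N, and SGEMM multiplies an M by K matrix by a K by N matrix. The ratios for SGEMV and SGEMM are leading-order values for large problems; the SGEMM entry takes M = N = K.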
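The counts in the table can be checked with a few lines of arithmetic. The sketch below simply evaluates the table's formulas for square problems (M = N = K) of a few sizes; the function names are made up for this illustration.

```python
# Evaluate the load/store and flop counts from the table.
# Function names are illustrative, not part of any library.

def saxpy_counts(N):
    """(memory references, floating-point operations) for SAXPY, length N."""
    return 3 * N, 2 * N


def sgemv_counts(M, N):
    """(memory references, floating-point operations) for SGEMV, M-by-N matrix."""
    return M * N + N + 2 * M, 2 * M * N


def sgemm_counts(M, N, K):
    """(memory references, floating-point operations) for SGEMM, (M-by-K)*(K-by-N)."""
    return 2 * M * N + M * K + K * N, 2 * M * N * K


def ratio(counts):
    """Memory references per floating-point operation."""
    refs, flops = counts
    return refs / flops


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n,
              ratio(saxpy_counts(n)),
              ratio(sgemv_counts(n, n)),
              ratio(sgemm_counts(n, n, n)))
```

As the size grows, the SAXPY ratio stays at 3/2, the SGEMV ratio approaches 1/2, and the SGEMM ratio falls like 2/N. That falling ratio is why level 3 has the most room to run fast.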



E. Bruce Pitman
Wed Sep 13 22:27:10 EDT 2000