next up previous
Next: BLAS Up: No Title Previous: More on Data Structures

Computations and memory

Each time we access a piece of data, we should try to get as many floating point ops out of that reference as we can. A performance monitor is the flops per reference, and we want that to be high. First, lets look at a simplier set of examples.

Consider, for example, the operations listed

calc       flops/pass          operation 
1              2                v1_i = v1_i + a*v2_i 
2              8            v1_i = v1_i + s2*v2_i + s3*v3_i + s4*v4_i + s5*v5_i  
3              1                v1_i = v2_i/v3_i  
4              2                v1_i = v1_i + s2*v2_{idx(i)}  
5              2                v1_i = v2_i - v3_i*v1_{i-1}  
6              2                 s = s + v1_i *v2_i



E. Bruce Pitman
Wed Sep 13 22:27:10 EDT 2000