The (poorly optimized) code in this directory was originally written for a
J90 system, but finished on a C90.  It should work on all Cray vector
computers.  For the T3E and T3D systems, the `alpha' subdirectory at the
same level as the directory containing this file is much better.

* `+' seems to be faster than `|' when combining carries.

* It is possible that the best multiply performance would be achieved by
  storing only 24 bits per element, and using lazy carry propagation.  Before
  calling i24mult, full carry propagation would be needed.

* Supply tasking versions of the C loops.
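
The notes above on combining carries and on 24-bit elements can be sketched
in portable C.  This is only an illustration under stated assumptions, not
the code in this directory: the names `add_n_sketch', `lazy_add24' and
`normalize24' are hypothetical, and keeping 24 significant bits in a 64-bit
word is one possible reading of the note, not necessarily the layout that
i24mult expects.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: 24 significant bits per 64-bit element. */
#define LIMB24_BITS 24
#define LIMB24_MASK ((UINT64_C(1) << LIMB24_BITS) - 1)

/* Full-word addition with carry.  The two carry-outs c1 and c2 can never
   both be 1: if a[i] + b[i] wraps, the wrapped sum is at most 2^64 - 2,
   so adding the incoming carry cannot wrap again.  Hence c1 + c2 and
   c1 | c2 always give the same value, and either may be used. */
uint64_t
add_n_sketch (uint64_t *r, const uint64_t *a, const uint64_t *b, size_t n)
{
  uint64_t cy = 0;
  for (size_t i = 0; i < n; i++)
    {
      uint64_t s = a[i] + b[i];
      uint64_t c1 = s < a[i];   /* carry out of a[i] + b[i] */
      uint64_t t = s + cy;
      uint64_t c2 = t < s;      /* carry out of adding the incoming carry */
      r[i] = t;
      cy = c1 + c2;             /* same value as c1 | c2 */
    }
  return cy;
}

/* Lazy addition: elements are added independently and no carry moves
   between them, so the loop has no serial dependency.  Elements may
   temporarily hold more than 24 bits; the spare high bits of the word
   absorb the excess as long as not too many lazy additions are stacked. */
void
lazy_add24 (uint64_t *r, const uint64_t *a, const uint64_t *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    r[i] = a[i] + b[i];
}

/* Full carry propagation: reduce every element to 24 bits again and
   return the carry out of the top element. */
uint64_t
normalize24 (uint64_t *r, size_t n)
{
  uint64_t cy = 0;
  for (size_t i = 0; i < n; i++)
    {
      uint64_t t = r[i] + cy;
      r[i] = t & LIMB24_MASK;
      cy = t >> LIMB24_BITS;
    }
  return cy;
}
```

In this sketch a caller would run `normalize24' on each operand before
handing it to a multiply routine, matching the note that full carry
propagation is needed before calling i24mult.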