The matrix multiplication programs generate two sample matrices and then form the product by creating a new matrix whose (i, j) elements are equal to the inner product of row i of the first matrix and column j of the second matrix.
The * Lisp and Paris programs have one major difference from Milan--since no facility exists for creating nested objects in these languages, the programs are written with matrices represented as linear structures. We give statistics for two versions of these programs. The multiread version has many communication collisions when reading each value to compute the inner products. The copy-scan version reads each value once and replicates the values using the Connection Machine's copy-scan primitive.
It is important to note here that Milan's ability to easily represent the two-dimensional structure of the matrix comes from dynamic allocation, a function present in PARALATION LISP, Crystal and Milan.
Table 3 gives the running time (in seconds) for each
program on
matrices. The * Lisp programs were timed
running interpreted on a Vax front-end to be consistent with the
execution of Milan. Data marked with
an asterisk (*) dates from the previous benchmarking suite (under
a previous version of Paris).
The * Lisp program to compute the values of w(n) for large values of n would not run as stack space was depleted too quickly. The time necessary to run it can be roughly estimated by multiplying the Paris timings by a factor of ten (based on other comparisons of implementations using the same algorithm). Table 4 gives timings (in seconds) for the 3 different versions; we used an 8k CM-2 for all timings except the last (for which we used a 16k machine); the Milan timings were done with a subway delay of 32 cycles. We also used a 24-bit word size for every problem except the last, which required a 26-bit word.
We have shown that a SIMD problem (such as matrix multiplication) runs
approximately ten times more slowly in Milan than in native Paris,
but more than twice as fast as as a ``high-level'' language.
Even more interesting, while Milan was outperformed by the unfactored
Paris version of w(n) up to about the first thousand values, after that
unfactored Paris grew exponentially more slowly while Milan grew linearly,
with Paris nearly twenty times slower than Milan and falling quickly
further behind for our largest test.