next up previous
Next: Extensions and Directions Up: An Exploration of Asynchronous Previous: Intractable

Benchmarks

Although the importance of asynchronous computation is clear, it remains to demonstrate its practicality. To evaluate the efficiency of our interpreter, we wrote Paris, * Lisp and Milan code to compute the standard matrix multiplication algorithm and the w function. We found that the Milan interpreter, with the power of the asynchronous model, performs only marginally slower than Paris on a typical synchronous problem and actually outperforms * Lisp on an asynchronous one.

The matrix multiplication programs generate two sample matrices and then form the product by creating a new matrix whose (i, j) elements are equal to the inner product of row i of the first matrix and column j of the second matrix.

The * Lisp and Paris programs have one major difference from Milan--since no facility exists for creating nested objects in these languages, the programs are written with matrices represented as linear structures. We give statistics for two versions of these programs. The multiread version has many communication collisions when reading each value to compute the inner products. The copy-scan version reads each value once and replicates the values using the Connection Machine's copy-scan primitive.

It is important to note here that Milan's ability to easily represent the two-dimensional structure of the matrix comes from dynamic allocation, a function present in PARALATION LISP, Crystal and Milan.

Table 3 gives the running time (in seconds) for each program on ${n}\times{n}$ matrices. The * Lisp programs were timed running interpreted on a Vax front-end to be consistent with the execution of Milan. Data marked with an asterisk (*) dates from the previous benchmarking suite (under a previous version of Paris).


  
Table 3: A comparison of 5 programs to compute matrix multiplication.

\begin{tabular}
{l\vert l\vert rr}
Language & Version & n & Time \\  \hline\hlin...
 ...  \hline
Milan & & 5 & 0.20 \\  & & 10 & 0.25 \\  & & 15 & 0.30 \\ \end{tabular}

The * Lisp program to compute the values of w(n) for large values of n would not run as stack space was depleted too quickly. The time necessary to run it can be roughly estimated by multiplying the Paris timings by a factor of ten (based on other comparisons of implementations using the same algorithm). Table 4 gives timings (in seconds) for the 3 different versions; we used an 8k CM-2 for all timings except the last (for which we used a 16k machine); the Milan timings were done with a subway delay of 32 cycles. We also used a 24-bit word size for every problem except the last, which required a 26-bit word.


  
Table 4: A comparison of 3 programs to compute the value of w(n) in parallel.

\begin{tabular}
{r\vert r\vert\vert r\vert r}
problem & total & unfactored & MIM...
 ...& 8.63 \\ 8000 & 261 & 77.11 & 9.39 \\ 16000 & 275 & 173.54 & 10.63\end{tabular}

We have shown that a SIMD problem (such as matrix multiplication) runs approximately ten times more slowly in Milan than in native Paris, but more than twice as fast as as a ``high-level'' language. Even more interesting, while Milan was outperformed by the unfactored Paris version of w(n) up to about the first thousand values, after that unfactored Paris grew exponentially more slowly while Milan grew linearly, with Paris nearly twenty times slower than Milan and falling quickly further behind for our largest test.[*]


next up previous
Next: Extensions and Directions Up: An Exploration of Asynchronous Previous: Intractable
Back to Chris Metcalf's home page