List of Figures
Bitreversal on a small mesh
Scheduled paths for bitreversal on a small mesh
Bitreversal routing (dynamic vs. scheduled routing)
Diamond-lattice scheduled routing
A virtual state machine transferring data
VFSM schedules on a small mesh
Data movement on a small mesh
NuMesh transfer actions
Broadcast abstraction
A high-level view of code generation
Simple application code
COP output for simple application
Final HLL code for node zero
COP code for matrix multiplication
COP code for Gaussian elimination
Gabriel code for an FFT star
LaRCS code for the n-body problem
A simple PCS program
CrOS code for a parallel prefix
Dependency graph for simple COP fragment
Simple abstract multi-threaded code example
Simple disjoint parallel example
COP disjoint example
One possible way to handle disjoint COP
NuMesh hardware architecture
NuMesh datapath
Dependency graph for simple COP fragment
Address allocation graph for simple COP fragment
Dependency graph for simple COP loop with deterministic routing
Comparative performance for transpose (cycles)
Comparative performance for transpose (wall-clock time)
Comparative performance for bitreverse (wall-clock time)
Prefix times (one-word prefix, cycles)
Prefix times (mesh size 64, cycles)
Broadcast implementations (one word, wall-clock time)
Broadcast implementations (1024 words, wall-clock time)
Online broadcast with dynamic vs. scheduled routing, wall-clock time
Matrix multiplication comparison, wall-clock time
Gaussian elimination comparison, wall-clock time
COP code for specifying nearest-neighbor connections
Back to Chris Metcalf's home page