As currently defined, the language does not let the HLL compiler request any particular clustering of communication slots in the schedule. Instead, the stream router is free to place the required bandwidth wherever it chooses. Support for clustering would let the application send an entire message in a short span of time, though possibly at the cost of a longer synchronization wait before the I/O begins. The current compiler does not attempt to control stream scheduling, though it would be easy to extend it to request that messages up to a certain size be allocated as single chunks of successive schedule slots.
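The trade-off between clustered and spread-out slots can be illustrated with a small sketch. It is not the stream router's actual algorithm: it assumes a cyclic schedule in which each owned slot carries one word per tick, and the names `spread_slots`, `clustered_slots`, and `wait_and_span` are hypothetical.

```python
def spread_slots(schedule_len, n_slots):
    """Assumed router behaviour: place n_slots evenly around the cyclic schedule."""
    if not 0 < n_slots <= schedule_len:
        raise ValueError("n_slots must be in 1..schedule_len")
    return [(i * schedule_len) // n_slots for i in range(n_slots)]


def clustered_slots(schedule_len, n_slots, start=0):
    """Hypothetical clustering request: n_slots successive slots from `start`."""
    if not 0 < n_slots <= schedule_len:
        raise ValueError("n_slots must be in 1..schedule_len")
    return [(start + i) % schedule_len for i in range(n_slots)]


def wait_and_span(slots, schedule_len, ready_at, n_words):
    """Return (sync_wait, transfer_span) in ticks for one message.

    sync_wait is the delay from `ready_at` until the first owned slot;
    transfer_span is the time from that slot until the last word is sent.
    """
    if not slots or n_words < 1:
        raise ValueError("need at least one slot and one word")
    owned = set(slots)
    t = ready_at
    while t % schedule_len not in owned:  # wait for the first owned slot
        t += 1
    first = t
    sent = 0
    while True:
        if t % schedule_len in owned:  # one word goes out per owned slot
            sent += 1
            if sent == n_words:
                break
        t += 1
    return first - ready_at, t - first + 1


if __name__ == "__main__":
    L, n = 16, 4
    print(wait_and_span(spread_slots(L, n), L, ready_at=5, n_words=4))
    print(wait_and_span(clustered_slots(L, n), L, ready_at=5, n_words=4))
```

Under these assumptions, a four-word message that becomes ready at tick 5 of a 16-slot schedule waits 3 ticks and then takes 13 ticks to send with evenly spread slots, but waits 11 ticks and then takes only 4 ticks with a clustered allocation.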
More significantly, if the application has a known communication pattern (say, a systolic matrix multiply), it could be beneficial to specify the preferred order of operator scheduling at the interface, so that the application could perform a sequence of read, read, compute, and write without blocking on the schedule for any operation. Currently, all the HLL compiler can do is find the critical inner loop, determine its timing, and then specify a schedule length for the communications.
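To make the effect of operator ordering concrete, the following sketch simulates an inner loop against a cyclic schedule and counts the ticks spent blocked. It is an illustration under stated assumptions, not the compiler's or router's actual interface: each I/O operator moves one word in one slot, a slot is labelled with the stream it serves (or `None` if it belongs to another stream), and the names `loop_length` and `stall_ticks` are hypothetical.

```python
def loop_length(body):
    """Schedule length implied by the inner loop: one tick per I/O, plus compute ticks."""
    return sum(arg if kind == "compute" else 1 for kind, arg in body)


def stall_ticks(schedule, body, iterations=1):
    """Count ticks the loop spends blocked waiting for its stream's slot."""
    streams = {arg for kind, arg in body if kind == "io"}
    if not streams <= set(schedule):
        raise ValueError("every stream used by the loop needs a slot")
    t = 0
    stalls = 0
    for _ in range(iterations):
        for kind, arg in body:
            if kind == "compute":
                t += arg  # compute does not depend on the schedule
                continue
            while schedule[t % len(schedule)] != arg:  # block until our slot
                stalls += 1
                t += 1
            t += 1  # the I/O itself occupies the slot
    return stalls


# Inner loop: read a, read b, compute for 2 ticks, write c.
BODY = [("io", "a"), ("io", "b"), ("compute", 2), ("io", "c")]

if __name__ == "__main__":
    print(loop_length(BODY))
    print(stall_ticks(["a", "b", None, None, "c"], BODY, iterations=3))
    print(stall_ticks(["c", "b", None, "a", None], BODY))
```

In this model the loop body takes 5 ticks, so the compiler would request a schedule of length 5. If the slots happen to follow the loop's own order (a, b, idle, idle, c), the loop never blocks; with the same slots in a different order (c, b, idle, a, idle), the first iteration alone stalls for 6 ticks. Specifying a schedule length cannot distinguish these two cases, which is why an ordering hint at the interface would help.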