|
Abstract : |
Scheduling computations with communications is the theoretical basis for achieving efficient parallelism on distributed memory systems. We generalize Graham's task-level in a manner to incorporate the effects of computation and communication. A new scheduling is proposed by combining task priority with efficient management of processor idle time. We also propose an optimization called Iterative Refinement Scheduling (IRS) that iteratively schedules the forward and backward computation graph. The task-level used in some scheduling iteration is obtained from the schedule generated in the previous iteration. Each iteration produces a new schedule and new task-levels. This approach enables searching and optimizing solutions as the result of using more refined task-level in each scheduling iteration. Evaluation and analysis of the results are carried out for different instances of communication granularities and problem parallelism. It is shown that solutions obtained out of few iterations statistically outperforms those generated by other recently proposed scheduling. IRS allows exploring a space of solutions whose size grows with the amount of parallelism and communication granularity. IRS enables optimizing the solution specially for critical instances such as fine-grain computations and large parallelism. (C) 2000 Elsevier Science B.V. All rights reserved., Scheduling DAGs with communication times is the theoretical basis for achieving efficient parallelism on distributed memory systems. We generalize Graham's task-level in a manner to incorporate the effects of computation, data size, and network latency. A new scheduling that uses the pro-posed task-level to make early reservation of resources for critical computation and communication is proposed. We also propose an optimization called Iterative Refinement Scheduling (IRS) that alternatively schedules the computation graph and its associated reverse. The task-level used in some scheduling iteration is the task's starting time that is achieved in the very previous iteration. IRS enables searching and optimizing solutions as the result of using more refined task-level in each scheduling iteration. Evaluation and analysis of the results are carried out for different instances of problem granularities, parallelism, and network latency such as the fully connected, hypercube, and ring. The finish time obtained from two-iteration scheduling outperforms those generated by other recently proposed scheduling as well as the clustering approaches. IRS allows exploring a space of solutions whose size grows with the amount of parallelism and communication granularity. Solutions generated by IRS largely outperform all known approaches specially for ne-grain problems where other approaches fail. IRS enables optimizing the solution specially for critical instances such as ne-grain DAGs and large parallelism. The time complexity of one IRS iteration is O(pn2)., Scheduling computations with communications is the theoretical basis for achieving efficient parallelism on distributed memory systems. We generalize Graham's task-level in a manner to incorporate the effects of computation and communication. A new scheduling is proposed by combining task priority with ecient management of processor idle time. We also propose an optimization called Iterative Refinement Scheduling (IRS) that iteratively schedules the forward and backward computation graph. The task-level used in some scheduling iteration is obtained from the schedule generated in the previous iteration. Each iteration produces a new schedule and new task-levels. This approach enables searching and optimizing solutions as the result of using more refined task-level in each scheduling iteration. Evaluation and analysis of the results are carried out for different instances of communication granularities and problem parallelism. It is shown that solutions obtained out of few iterations statistically outperforms those generated by other recently proposed scheduling. IRS allows exploring a space of solutions whose size grows with the amount of parallelism and communication granularity. IRS enables optimizing the solution specially for critical instances such as fine-grain computations and large parallelism., |