Home

Performing Work Efficiently in the Presence of Faults


Author(s) : Joseph Y. Halpern Cynthia Dwork, 
Publisher : N/A
Publication Date : 1998
ISSN : N/A
Abstract : Abstract. We consider a system of t synchronous processes that communicate only by sending messages to one another, and that together must perform n independent units of work. Processes may fail by crashing; we want to guarantee that in every execution of the protocol in which at least one process survives, all n units of work will be performed. We consider three parameters: the number of messages sent, the total number of units of work performed (including multiplicities), and time. We present three protocols for solving the problem. All three are work-optimal, doing O(n+ t) work. The first has moderate costs in the remaining two parameters, sending O(t p t) messages, and taking O(n + t) time. This protocol can be easily modified to run in any completely asynchronous system equipped with a failure detection mechanism. The second sends only O(t log t) messages, but its running time is large (O(t 2 (n+ t)2 n+t)). The third is essentially time-optimal in the (usual) case in which there are no failures, and its time complexity degrades gracefully as the number of failures increases.,