TY - GEN
T1 - Tolerating faults in synchronization networks
AU - Bhatt, Sandeep N.
AU - Chung, Fan R.K.
AU - Leighton, F. Thomson
AU - Rosenberg, Arnold L.
N1 - Publisher Copyright:
© Springer-Verlag Berlin Heidelberg 1992.
PY - 1992
Y1 - 1992
N2 - A synchronization network (SN) consists of processing elements (PEs) at the leaves of a complete binary tree, with routing switches at interior nodes. We study the problem of rendering an SN tolerant to PE failures, by adding queues to its edges. We obtain the following results. In the worst-case, an N-PE SN whose edges have queues of capacity O(log log N) can tolerate the failure of a positive fraction of its PEs, no matter how the failed PEs are distributed; furthermore, this capacity requirement cannot be lowered by more than a small constant factor. In the expected-case, with probability exceeding 1-N-Ω(1), an N-PE SN whose edges have queues of capacity O(log log log N) can tolerate the failure of a positive fraction of its PEs; we do not know if this capacity requirement can be lowered. We also present an algorithm which, given an SN with queues of capacity C, salvages a maximum number of fault-free PEs; the running time is a low-degree polynomial in N even when C is as large as log(N/log N).
AB - A synchronization network (SN) consists of processing elements (PEs) at the leaves of a complete binary tree, with routing switches at interior nodes. We study the problem of rendering an SN tolerant to PE failures, by adding queues to its edges. We obtain the following results. In the worst-case, an N-PE SN whose edges have queues of capacity O(log log N) can tolerate the failure of a positive fraction of its PEs, no matter how the failed PEs are distributed; furthermore, this capacity requirement cannot be lowered by more than a small constant factor. In the expected-case, with probability exceeding 1-N-Ω(1), an N-PE SN whose edges have queues of capacity O(log log log N) can tolerate the failure of a positive fraction of its PEs; we do not know if this capacity requirement can be lowered. We also present an algorithm which, given an SN with queues of capacity C, salvages a maximum number of fault-free PEs; the running time is a low-degree polynomial in N even when C is as large as log(N/log N).
UR - http://www.scopus.com/inward/record.url?scp=85029760726&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85029760726&partnerID=8YFLogxK
U2 - 10.1007/3-540-55895-0_391
DO - 10.1007/3-540-55895-0_391
M3 - Conference contribution
AN - SCOPUS:85029760726
SN - 9783540558958
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 1
EP - 12
BT - Parallel Processing
A2 - Bouge, Luc
A2 - Cosnard, Michel
A2 - Robert, Yves
A2 - Trystram, Denis
T2 - 2nd Joint International Conference on Vector and Parallel Processing, CONPAR 1992
Y2 - 1 September 1992 through 4 September 1992
ER -