Fast diagnosis of multiprocessor systems with random faults. (English) Zbl 0803.68004

Summary: Processors in a multiprocessor system fail independently with constant probability \(0< p< 1/2\). They can test one another and faulty processors have to be identified on the basis of test results. Fault-free processors diagnose other fault-free processors correctly and find faults in faulty ones with probability \(q\leq 1\) in each test. Faulty testers are unreliable: they may even behave maliciously. Tests are independent and in every time unit a processor can be involved in at most one test. We propose testing schemes which are fast, use few tests and are correct with probability converging to 1 as the size of the system grows.


68M15 Reliability, testing and fault tolerance of networks and computer systems
Full Text: DOI EuDML


[1] 1. D. ANGLUIN and L. G. VALIANT, Fast Probabilistic Algorithms for Hamiltonian Circuits and Matchings, J. Comput. System Sci., 1979, 18, pp. 155-193. Zbl0437.05040 MR532174 · Zbl 0437.05040
[2] 2. R. BEIGEL, S. R. KOSARAJU and G. F. SULLIVAN, Locating Faults in a Constant Number of Parallel testing Rounds, in Proceedings of the 1989 ACM Symposium on Parallel Algorithms and Architectures, pp. 189-198.
[3] 3. P. BERMAN and A. PELC, Distributed Probabilistic Fault Diagnosis for Multiprocessor Systems, Digest of Papers, FTCS-20, 1990, pp. 340-346.
[4] 4. D. M. BLOUGH, Fault Detection and Diagnosis in Multiprocessor Systems, Ph. D. Thesis, The John Hopkins University, 1988.
[5] 5. D. M. BLOUGH, G. F. SULLIVAN and G. M. MASSON, Almost Certain Diagnosis for Intermittenly Faulty Systems, Digest of Papers, FTCS-18, 1988, pp. 260-271.
[6] 6. D. M. BLOUGH, G. F. SULLIVAN and G. M. MASSON, Fault Diagnosis for Sparsely Interconnected Multiprocessor Systems, Digest of Papers, FTCS-19, 1989, pp. 62-69.
[7] 7. M. L. BLOUNT, Probabilistic Treatment of Diagnosis in Digital Systems, Digest of Papers, FTCS-7, 1077, pp. 72-77.
[8] 8. A. T. DAHBURA, System-Level Diagnosis: a Perspective for the Third Decade, in Concurrent Computation: Algorithms, Architectures, Technologies, Plenum Publ. Corp., 1988.
[9] 9. A. T. DAHBURA, K. K. SABNANI and L. L. KING, The Comparison Approach to Multiprocessor Fault Diagnosis, IEEE Trans. Comput., March 1987, 36, pp. 373-378.
[10] 10. D. FUSSELL and S. RANGARAJAN, Probabilistic Diagnosis of Multiprocessor Systems with Arbitrary Connectivity, Digest of Papers, FTCS-19, 1989, pp. 560-565.
[11] 11. T. HAGERUP and Ch. RÜB, A Guided Tour of Chernoff Bounds, Inf. Proc. Lett., 1989/1990, 33, pp. 305-308. Zbl0702.60021 MR1045520 · Zbl 0702.60021
[12] 12. S. N. MAHESHWARI and S. L. HAKIMI, On Models for Diagnosable Systems and Probabilistic Fault Diagnosis, IEEE Trans. Comput., March 1976, 25, pp. 228-236. Zbl0339.68047 MR530234 · Zbl 0339.68047
[13] 13. F. P. PREPARATA, R. T. CHIEN, On the Connection Assignment Problem of Diagnosable Systems, IEEE Trans. Electr. Comput., December 1967, 16, pp. 848-854. Zbl0189.16904 · Zbl 0189.16904
[14] 14. S. RANGARAJAN and D. FUSSEL, A Probabilistic Method for Fault Diagnosis of Multiprocessor Systems, Digest of Papers, FTCS-18, 1988, pp. 278-283.
[15] 15. E. R. SCHEINERMAN, Almost Sure Fault Tolerance in Random Graphs, SIAM J. Comput., 1987, 16, pp. 1124-1134. Zbl0654.68015 MR917044 · Zbl 0654.68015
[16] 16. E. SCHMEICHEL, S. L. HAKIMI, M. OTSUKA and G. SULLIVAN, On Minimizing Testing Rounds for Fault Identification, Digest of Papers, FTCS-18, 1988, pp. 266-271.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.