##
**Introduction to distributed algorithms.**
*(English)*
Zbl 0826.68056

Cambridge: Univ. Press. xii, 534 p. (1994).

This monograph introduces the reader into algorithmic aspects of the field of principles of distributed systems. Here, the term “distributed system” is used to mean an interconnected collection of autonomous computers, processes, or processors. To be qualified as ‘autonomous’, the nodes of the system (i.e., computers, processes, processors) must at least equipped with an own privat control; to be qualified as ‘interconnected’, the nodes must be able to exchange information.

The presented material is divided into three parts: Protocols, Fundamental algorithms, and Fault tolerance.

Part one, “Protocols”, deals with the communication protocols used in the implementation of computer communication networks. After describing in Chatper 2 the underlying communication model which is based on the notion of transition systems, in Chapter 3 the problem of massage transmission between two nodes of a distributed system is considered. Chapter 4 studies the problem of routing in computer networks. In Chapter 5, the discussion of computer networks ends with some ideas to avoid store-and-forward deadlocks in packet-switched computer networks.

Part two, “Fundamental algorithms” is the central and most extended part of the book. Beside of presenting a number of algorithms the procedures which are used in many distributed applications, useful theory is developed about the computational power of different network assumptions. In Chapter 6, wave and transversal algorithms are defined and investigated which are essential in many distributed-control problems. In Chapter 7 the fundamental problem of electing a single process that is to play a distinguished role in a subsequent computation is considered for ring networks as well as for arbitrary networks. Chapter 8 presents some algorithmic solutions for another fundamental problem in distributed computation, namely the termination detection. Besides the proof of a lower bound several classical algorithms are described in detail. In Chapter 9 the computational power of systems is studied where processes are not distinguished by unique identities. Furthermore, in this chapter, probabilistic algorithms are introduced and their use is discussed. Chapter 10 is devoted to the computation of snapshots (i.e., global views of the system’s state) which are useful for determining properties of the computation like deadlocks, or computation progress. In the final chapter of this part, in Chapter 11, the effect of the availability of a global time concept is studied. Due to the discussion of several degrees of synchronisms it is shown that the better the synchronism of a network, the lower the complexity of algorithms for these problems.

The third part “Fault tolerance” of the book is devoted to the study of fault tolerant systems and algorithms. While Chapter 13 studies the fault tolerance of asynchronous systems in detail, Chapter 14 discusses the fault tolerance of synchronous algorithms. Beside other things it is shown that synchronous systems offer more possibilities to tolerate nontrivial failures than asynchronous ones. Chapter 14 deals with the problem of implementing synchronism in an unreliable system. In the final chapter of the third part stabilizing algorithms are considered. Beside of developing some theory about stabilization several stabilizing algorithms are presented.

Beside of giving a foundated introduction into the field of algorithms for distributed systems the book provides useful background and reference information for professional engineers and researchers working in this field. Each chapter ends with a list of exercises and small projects. From the author, Gerhard Tel, a simulator program can be requested that can be used to experiment with the algorithms in the book or to test new algorithms.

The presented material is divided into three parts: Protocols, Fundamental algorithms, and Fault tolerance.

Part one, “Protocols”, deals with the communication protocols used in the implementation of computer communication networks. After describing in Chatper 2 the underlying communication model which is based on the notion of transition systems, in Chapter 3 the problem of massage transmission between two nodes of a distributed system is considered. Chapter 4 studies the problem of routing in computer networks. In Chapter 5, the discussion of computer networks ends with some ideas to avoid store-and-forward deadlocks in packet-switched computer networks.

Part two, “Fundamental algorithms” is the central and most extended part of the book. Beside of presenting a number of algorithms the procedures which are used in many distributed applications, useful theory is developed about the computational power of different network assumptions. In Chapter 6, wave and transversal algorithms are defined and investigated which are essential in many distributed-control problems. In Chapter 7 the fundamental problem of electing a single process that is to play a distinguished role in a subsequent computation is considered for ring networks as well as for arbitrary networks. Chapter 8 presents some algorithmic solutions for another fundamental problem in distributed computation, namely the termination detection. Besides the proof of a lower bound several classical algorithms are described in detail. In Chapter 9 the computational power of systems is studied where processes are not distinguished by unique identities. Furthermore, in this chapter, probabilistic algorithms are introduced and their use is discussed. Chapter 10 is devoted to the computation of snapshots (i.e., global views of the system’s state) which are useful for determining properties of the computation like deadlocks, or computation progress. In the final chapter of this part, in Chapter 11, the effect of the availability of a global time concept is studied. Due to the discussion of several degrees of synchronisms it is shown that the better the synchronism of a network, the lower the complexity of algorithms for these problems.

The third part “Fault tolerance” of the book is devoted to the study of fault tolerant systems and algorithms. While Chapter 13 studies the fault tolerance of asynchronous systems in detail, Chapter 14 discusses the fault tolerance of synchronous algorithms. Beside other things it is shown that synchronous systems offer more possibilities to tolerate nontrivial failures than asynchronous ones. Chapter 14 deals with the problem of implementing synchronism in an unreliable system. In the final chapter of the third part stabilizing algorithms are considered. Beside of developing some theory about stabilization several stabilizing algorithms are presented.

Beside of giving a foundated introduction into the field of algorithms for distributed systems the book provides useful background and reference information for professional engineers and researchers working in this field. Each chapter ends with a list of exercises and small projects. From the author, Gerhard Tel, a simulator program can be requested that can be used to experiment with the algorithms in the book or to test new algorithms.

Reviewer: C.Meinel (Trier)