×

A simple fault-tolerant adaptive and minimal routing approach in 3-D meshes. (English) Zbl 1023.68126

Summary: We propose a sufficient condition for minimal routing in 3-Dimensional (3-D) meshes with faulty nodes. It is based on an early work of the author on minimal routing in 2-dimensional meshes. Unlike many traditional models that assume all the nodes know global fault distribution or just adjacent fault information, our approach is based on the concept of limited global fault information. First, we propose a fault model called faulty cube in which all faulty nodes in the system are contained in a set of faulty cubes. Fault information is then distributed to limited number of nodes while it is still sufficient to support minimal routing. The limited fault information collected at each node is represented by a vector called extended safety level. The extended safety level associated with a node can be used to determine the existence of a minimal path from this node to a given destination. Specifically, we study the existence of minimal paths at a given source node, limited distribution of fault information, minimal routing, and deadlock-free and livelock-free routing. Our results show that any minimal routing that is partially adaptive can be applied in our model as long as the destination node meets a certain condition. We also propose a dynamic planar-adaptive routing scheme that offers better fault tolerance and adaptivity than the planar-adaptive routing scheme in 3-D meshes. Our approach is the first attempt to address adaptive and minimal routing in 3-D meshes with faulty nodes using limited fault information.

MSC:

68W05 Nonnumerical algorithms
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Dally W J. The J-Machine: System Support for Actors. Towards Open Information Science, Hewitt and Agha (eds.), MIT Press, 1992.
[2] Koeninger R K, Furtney M, Walker M. A shared memory MPP from cray research.Digital Technical Journal, Spring 1994, 6(2): 8–21.
[3] Wu J. Adaptive fault-tolerant routing in cube-based multicomputers using safety vectors.IEEE Transactions on Parallel and Distributed Systems, April, 1998, 9(4): 321–334. · Zbl 05107048 · doi:10.1109/71.667894
[4] Wu J. Reliable unicasting in faulty hypercubes using safety levels.IEEE Transactions on Computers, Feb., 1997, 46(2): 241–247. · doi:10.1109/12.565613
[5] Chen M S, Shin K G. Depth-first search approach for fault-tolerant routing in hypercube multicomputers.IEEE Transactions on Parallel and Distributed Systems, April, 1990, 1(2): 152–159. · Zbl 05106641 · doi:10.1109/71.80143
[6] Fleury E, Fraigniaud P. A general theory for deadlock avoidance in wormhole-ronted networks.IEEE Transactions on Parallel and Distributed Systems, July, 1998, 9(7): 626–638. · Zbl 05107444 · doi:10.1109/71.707539
[7] Wu J. Fault-tolerant adaptive and minimal routing in mesh-connected multicomputers using extended safety levels.IEEE Transactions on Parallel and Distributed Systems, Feb., 2000, 11(2): 149–159. · doi:10.1109/71.841751
[8] Chien A A, Kim J H. Planar-adaptive routing: Low-cost adaptive networks for multiprocessors.Journal of ACM, January, 1995, 42(1): 91–123. · Zbl 0886.68017 · doi:10.1145/200836.200856
[9] Boppana R V, Chalasani S. Fault tolerant wormhole routing algorithms for mesh networks.IEEE Transactions on Computers July, 1995, 44(7): 848–864. · Zbl 1053.68513 · doi:10.1109/12.392844
[10] Boura Y M, Das C R. Fault-tolerant routing in mesh networks. InProc. 1995 International Conference on Parallel Processing, 1995, I, pp.106–109.
[11] Libeskind-Hadas R, Brandt E. Origin-Based fault-tolerant routing in the mesh. InProc. the 1st International Symposium on High Performance Computer Architecture, Raleigh, North Carolina, Jan., 1995, pp. 102–111.
[12] Su C C, Shin K G. Adaptive fault-tolerant deadlock-free routing in meshes and hypercubes.IEEE Transactions on Computers, June, 1996, 45(6): 672–683. · Zbl 1049.68523
[13] Chen X, Wu J. Minimal routing in 3-D meshes using extended safety levels. InProc. ISATED International Conference on Parallel and Distributed Systems, Oct., 1998.
[14] Duato J, Yalamanchili S, Ni L. Interconnection networks: An engineering approach.IEEE Computer Society, Los, Alamitos, CA, 1997.
[15] Linder D H, Harden J C. An adaptive and fault-tolerant wormhole routing strategy fork-aryn-cubes.IEEE Transactions on Computers, Jan. 1991, 40(1): 2–12. · doi:10.1109/12.67315
[16] Wu J. A theory of fault-tolerant adaptive and minimal routing inn-dimensional meshes.The Computer Journal, 2002, 5(3): 349–363. · Zbl 1010.68005 · doi:10.1093/comjnl/45.3.349
[17] Panda D K. Issues in designing efficient and practical algorithms for collective communication on wormholerouted systems. InProc. the 1995 ICPP Workshop on Challenges for Parallel Processing, CRC Press, Oconomowc, Wisconsin, Aug., 1995, pp.8–15.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.