×

A reindexing based approach towards mapping of DAG with affine schedules onto parallel embedded systems. (English) Zbl 1327.68048

Summary: We address the problem of optimally mapping uniform DAGs to systolic arrays, given an affine timing function. We introduce an automatic allocation method based on a preprocessing by reindexing that transforms the initial DAG into a new one that enables the well known projection method to minimize the number of processors along a number of directions. We demonstrate its superiority to other methods, and establish the space-optimality of the proposed method. We also show an upper bound on the number of processors that corresponds to the best space complexity that both the projection method, and the so-called grouping method can give for the initial DAG. We also describe how the new allocation method can be implemented in tools.

MSC:

68M14 Distributed systems
68M07 Mathematical problems of computer architecture
68M20 Performance evaluation, queueing, and scheduling in the context of computer systems
68R10 Graph theory (including graph drawing) in computer science
PDFBibTeX XMLCite
Full Text: DOI HAL

References:

[2] Benaini, A.; Robert, Y., Space-minimal systolic arrays for Gaussian elimination and the algebraic path problem, Parallel Comput., 15, 211-225 (1990) · Zbl 0714.65027
[3] Bermond, J. C.; Peyrat, C.; Sakho, I.; Tchuenté, M., Parallelization of the Gaussian elimination on systolic arrays, J. Parallel Distrib. Comput., 33, 69-75 (1996)
[6] Cappello, P., A processor-time minimal systolic array for cubical mesh algorithms, IEEE Trans. Parallel Distrib. Syst., 3, 1, 4-13 (1992)
[7] Clauss, P.; Mongenet, C.; Perrin, G. R., Calculus of space-optimal mappings of systolic algorithms on processor arrays, IEEE Int. Conf. Appl. Specific Array Processors, 591-602 (1990)
[8] Clauss, P.; Perrin, G. R., Optimal mapping of systolic algorithms by regular instructions shifts, (Int. Conf. on Application-Specific Array Processors. Int. Conf. on Application-Specific Array Processors, ASAP’94 (1994), IEEE Computer Society Press: IEEE Computer Society Press San Francisco, CA, Silver Spring, MD), 224-235
[9] Darte, A.; Risset, T.; Robert, Y., Synthesizing systolic arrays: Some recent developments, (Int. Conf. on Application Specific Array Processors (1991), IEEE Computer Soc. Press), 372-386
[11] Djamegni, C. T., Synthesis of space-time optimal systolic algorithms for the Cholesky factorization, Discrete Math. Theoret. Comput. Sci., 5, 109-120 (2002) · Zbl 0994.68011
[12] Djamegni, C. T., Mapping rectangular mesh algorithms onto asymptotically space-optimal arrays, J. Parallel and Distrib. Comput., 64, 3, 345-359 (2004) · Zbl 1101.68409
[14] Djamegni, C. T., Matrix product on modular linear systolic arrays for algorithms with affine schedule, J. Parallel and Distrib. Comput., 66, 3, 323-333 (2006) · Zbl 1122.68150
[15] Djamegni, C. T., Exécution d’un graphe cubique de tâches sur un réseau 2D et asymptotiquement optimal, African Revue in Informatics and Mathematics Applied ARIMA, 4, 53-65 (2006)
[16] Djamegni, C. T.; Quinton, P.; Rajopadhye, S.; Risset, T., Derivation of systolic algorithms for the algebraic path problem by recurrence transformations, Parallel Comput., 26, 1429-1445 (2000) · Zbl 0948.68222
[17] Djamegni, C. T.; Tchuenté, M., Scheduling of the DAG associated with pipeline inversion of triangular matrices, Parallel Process. Lett., 6, 1, 13-26 (1996)
[18] Djamegni, C. T.; Tchuenté, M., A new dynamic programming algorithm on two dimensional arrays, Parallel Process. Lett., 10, 1, 15-27 (2000)
[21] Ganapathy, K. N.; Wah, B. W., Optimal synthesis of algorithm-specific lower-dimensional processor arrays, IEEE Trans. Parallel Distrib. Syst., 7, 3, 274-287 (1996)
[23] Karp, R. M.; Miller, R. E.; Winograd, S., The organization of computations for uniform recurrence equations, J. ACM, 14, 3, 563-590 (1967) · Zbl 0171.38305
[24] Kienhuis, B.; Rijpkema, E.; Deprettere, E., Compaan: Deriving process networks from Matlab for embedded signal processing architectures, (Proceedings of the 8th Int. Workshop on Hardware/Software Codesign, CODES’2000 (2000), ACM: ACM San Diego), 13-17
[25] Lamport, L., The parallel execution of Do loops, Commun. ACM, 17, 2, 83-93 (1974) · Zbl 0273.68012
[26] Louka, B.; Tchuenté, M., Dynamic programming on two-dimensional systolic arrays, Inform. Process. Lett., 29, 2, 97-104 (1988) · Zbl 0656.90096
[27] Louka, B.; Tchuenté, M., Triangular matrix inversion on systolic arrays, Parallel Comput., 14, 2, 223-228 (1990) · Zbl 0708.65026
[29] Moldovan, D. I., On the analysis and synthesis of VLSI algorithms, IEEE Trans. Comput., C-31, 11, 1121-1126 (1982) · Zbl 0492.68032
[30] Peir, J. K.; Cytron, R., Minimum distance: A method for partitioning recurrences for multiprocessors, IEEE Trans. on Comput., 38, 8, 1203-1211 (1989)
[32] Quinton, P.; Rajopadhye, S.; Risset, T., Extension of the ALPHA language to recurrences on sparse periodic domains, IEEE Int. Conf. on Application Specific Array Processors, 391-401 (1996)
[33] Quinton, P.; Dongen, V. V., The mapping of linear equations on regular arrays, J. VLSI Signal Process., 1, 2, 95-113 (1989) · Zbl 0825.68431
[34] Rajopadhye, S. V., Synthesizing systolic arrays with control signal from recurrence equations, Distrib. Comput., 3, 88-105 (1989)
[35] Rajopadhye, S. V.; Purushothaman, S.; Fujimoto, R. M., Synthesizing systolic arrays from recurrences equations with linear dependencies, (Proceedings, Sixth Conference on Foundations of Software Technology and Theoretical Computer Science. Proceedings, Sixth Conference on Foundations of Software Technology and Theoretical Computer Science, LNCS, vol. 241 (1986), Springer Verlag), 488-503 · Zbl 0604.68027
[36] Rao, S.; Kailath, T., Regular iterative algorithms and their implementation on processors arrays, Proc. IEEE, 76, 4, 259-269 (1988)
[37] Sakho, I.; Tchuenté, M., Méthode de conception d’algorithmes par-allèles pour réseaux réguliers, Tech. Sci. Inform., 8, 63-72 (1989) · Zbl 0668.68060
[39] Shang, W.; Fortes, J. A.B., Time optimal linear schedules for algorithms with uniform dependencies, IEEE Trans. Comput., 40, 723-742 (1991) · Zbl 1395.68077
[40] Shang, W.; Fortes, J. A.B., On mapping of uniform dependence algorithms into lower dimensional processors arrays, IEEE Trans. Parallel Distrib. Syst., 3, 350-363 (1992)
[41] Schrijver, A., Theory of Linear and Integer Programming (1986), John Wiley and Sons: John Wiley and Sons New York · Zbl 0665.90063
[42] Tchuenté, M., Parallel Computation on Regular Arrays (1992), Manchester University Press, Manchester and Wiley: Manchester University Press, Manchester and Wiley NY, USA
[43] Teich, J.; Thiele, L., Partitioning of processor arrays: A piecewise regular approach. Integration, The VLSI J., 14, 297-332 (1993) · Zbl 0777.68024
[44] Thiele, L., Resource constrained scheduling of uniform algorithms, J. VLSI Signal Process., 10, 3, 295-310 (1995)
[45] Thiele, L.; Roychowdhury, V., Systematic design of local processor arrays for numerical algorithms, (Deprettere, E. F.; Van der Veen, A. J., Algorithms and Parallel VLSI Architectures. Algorithms and Parallel VLSI Architectures, Tutorials, Vol. A (1991), Elsevier: Elsevier Amsterdam), 329-339 · Zbl 0793.68003
[46] Wong, Y.; Delosme, J. M., Transformation of broadcasts into propagations in systolic algorithms, J. Parallel Distrib. Comput., 14, 121-145 (1992)
[48] Yeh, J. W.; Cheng, W. J.; Jen, C. W., VASS a VLSI array system synthesizer, J. VLSI Signal Process., 12, 135-158 (1996)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.