×

Effective parallelization techniques for loop nests with non-uniform dependences. (English) Zbl 0976.68039

Summary: The parallelism of loop nests with non-uniform dependences is difficult to extract and ineffectively explored by the existing parallelization schemes. We propose new efficient techniques in extracting parallelism of loop nests with non-uniform dependences using their irregularity. By this way, current highly parallel multiprocessor systems such as multithreaded and clustering multiprocessor systems can be fully utilized. These four mechanisms are (a) parallelization part splitting, (b) partial parallelization decomposition, (c) irregular loop interchange and (d) growing pattern detection. They explore parallelisms of special parallel patterns for nested loops with nonuniform dependences. The loop transformations used in uniform loops are also applied in non-uniform dependence loops after legality tests. We apply the results of classical convex theory and detect special parallel patterns of dependence vectors. We also proposed an algorithm that combines above mechanisms to enhance parallelism. We demonstrate that our technique gives much better speedup and extracts more parallelism than the existing techniques. Thus, we are encouraged by these apparent enhancements to pursue further development.

MSC:

68N20 Theory of compilers and interpreters
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] DOI: 10.1109/71.298207 · Zbl 05107529 · doi:10.1109/71.298207
[2] DOI: 10.1145/360827.360844 · Zbl 0273.68012 · doi:10.1145/360827.360844
[3] DOI: 10.1109/71.149964 · Zbl 05106990 · doi:10.1109/71.149964
[4] DOI: 10.1109/TC.1987.1676942 · doi:10.1109/TC.1987.1676942
[5] Tseng S. Y., Proc. 1992 International Conference on Parallel and Distributed Systems. IEEE Computer Society pp 340– (1992)
[6] DOI: 10.1109/71.224217 · Zbl 05107125 · doi:10.1109/71.224217
[7] DOI: 10.1109/SPDP.1994.346179 · doi:10.1109/SPDP.1994.346179
[8] Shen Z., Proc. 1989 IEEE International Conference on Parallel Processing pp 145– (1989)
[9] DOI: 10.1145/73560.73588 · doi:10.1145/73560.73588
[10] DOI: 10.1016/0743-7315(92)90027-K · doi:10.1016/0743-7315(92)90027-K
[11] Bazaraa M. S., Linear Programming and Network Flows. (1990) · Zbl 0722.90042
[12] Banerjee U., Dependence Analysis for Supercomputing. (1988) · doi:10.1007/978-1-4684-6894-6
[13] DOI: 10.1109/12.508321 · Zbl 1049.68969 · doi:10.1109/12.508321
[14] DOI: 10.1109/71.503771 · Zbl 05107159 · doi:10.1109/71.503771
[15] Zima H., Supercompilersfor Parallel and Vector Computers. (1990)
[16] Wilson R., An OvervieIV of the SUIF Compiler System. (1996)
[17] Tseng S. Y., Proc. 1996 IPPS pp 23–
[18] Veenstra J. E., MINT Tutorial and User Manual. Tech. Rep. 452 (1994)
[19] Su J. P., Proc. ICS ’96 on Computer Architecture pp 160– (1996)
[20] Chul-Kwon Cho, International Conference on Parallel Processing pp 314– (1997)
[21] Swamy Punyanurtula, Parallel Algorithms and Applications 12 pp 113– (1997) · Zbl 0886.68043 · doi:10.1080/01495739708941418
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.