A cache-oblivious self-adaptive full multigrid method. (English) Zbl 1174.65550

Summary: This paper presents a new efficient way to implement multigrid algorithms on adaptively refined grids. To cope with todays demands in high-performance computing, we cannot do without such highly sophisticated numerical methods. But if we do not implement them very carefully, we lose a lot of efficiency in terms of memory usage: using trees for the storage of hierarchical multilevel data causes a large amount of non-local (in terms of the physical memory space) data accesses, and often requires the storage of pointers to neighbours to allow the evaluation of discrete operators (difference stencils, restrictions, interpolations, etc.). The importance of this problem becomes clear if we remember that storage and not the CPU is the bottleneck on modern computers.
We established a cache-oblivious and storage-minimizing algorithm based on the concept of space-tree grids combined with a cell-oriented operator evaluation, a linear ordering of grid cells along a space-filling curve, and a sophisticated construction of linearly processed data structures for vertex data. In this context, we could show that the implementation of a dynamically adaptive \(F\)-cycle is, first, very natural and, second, does not cause any overhead in terms of storage usage and access as adaptivity and multilevel data do not disturb the linear processing order of our data structures.


65N55 Multigrid methods; domain decomposition for boundary value problems involving PDEs
68P05 Data structures
65N50 Mesh generation, refinement, and adaptive methods for boundary value problems involving PDEs


Cachegrind; DiMEPACK
Full Text: DOI


[1] . Performance Optimization of Numerically Intensive Codes. SIAM: Philadelphia, PA, 2001. · Zbl 0985.68006
[2] Douglas, Parallel Algorithms and Applications 9 pp 195– (1996) · Zbl 1049.68886
[3] Douglas, Electronic Transactions on Numerical Analysis 10 pp 21– (2000)
[4] Data locality optimizations for iterative numerical algorithms and cellular automata on hierachical memory architectures. Ph.D. Thesis, Institut für Informatik, Universität Erlangen-Nürnberg, SCS Publishing House, 2004.
[5] . An overview of cache optimization techniques and cache-aware numerical algorithms. In Algorithms for Memory Hierarchies–Advanced Lectures, , (eds), Lecture Notes in Computer Science, vol. 2625. Springer: Berlin, 2003; 213–232. · Zbl 1024.68768
[6] Data locality optimizations for multigrid methods on structured grids. Ph.D. Thesis, Institut für Informatik, TU München, 2001.
[7] Space-Filling Curves. Springer: New York, 1994. · Zbl 0806.01019
[8] Cache-oblivious algorithms. Master Thesis, Massachusetts Institute of Technology, 1999.
[9] , , . Cache-oblivious algorithms. Proceedings of the 40th Annual Symposium on Foundations of Computer Science, New York, 1999; 285–297.
[10] Cache-oblivious algorithms and data structures. Lecture Notes from the EEF Summer School on Massive Data Sets, University of Aarhus, Denmark, June 27–July 1, Lecture Notes in Computer Science. Springer: Berlin, 2002.
[11] Griebel, Parallel Computing 25 pp 827– (1999)
[12] . Hash based adaptive parallel multilevel methods with space-filling curves. In NIC Series, Rollnik H, Wolf D (eds), vol. 9. 2002; 479–492.
[13] , . Domain decomposition for adaptive hp finite element methods. In Domain Decomposition Methods in Scientific and Engineering Computing, Proceedings of the 7th International Conference on Domain Decomposition, Keyes D, Xu J (eds), Contemporary Mathematics, vol. 180. 1994; 203–214.
[14] , . Efficient parallel adaptive finite element methods using self-scheduling data and computations. In High Performance Computing–HiPC’99, 6th International Conference, Calcutta, India, December 17–20, 1999, Proceedings, Banerjee P, Prasanna VK, Sinha BP (eds), HiPC, Lecture Notes in Computer Science, vol. 1745. 1999; 359–363.
[15] , , . A key based parallel adaptive refinement technique for finite element methods. In Proceedings Computational Techniques and Applications: CTAC’97, , (eds) World Scientific: Singapore, 1998; 577–584. · Zbl 1086.74525
[16] Adaptive Parallel Multilevel Methods for Partial Differential Equations. Universität Bonn: Habilitationsschrift, 2001.
[17] , . A parallel multilevel method for adaptively refined cartesian grids with embedded boundaries, American Institute of Aeronautics and Astronautics-2000-808, Aerospace Science Meeting and Exhibit, 38th, Reno, Nevada, January 10–13, 2000.
[18] Finite Elements. Theory, Fast Solvers and Applications in Solid Mechanics. Cambridge University Press: Cambridge, MA, 2001.
[19] Eine cache-optimale Implementierung der Finite-Elemente-Methode. Doctoral Thesis, Institut für Informatik, TU München, 2004.
[20] Entwicklung eines cache-optimalen 3D Finite-Element-Verfahrens für große Probleme. Doctoral Thesis, Institut für Informatik, TU München, 2004.
[21] Entwicklung eines cache-optimalen Finite-Element-Verfahrens zur Lösung d-dimensionaler Probleme. Diploma Thesis, Institut für Informatik, TU München, 2005.
[22] Günther, SIAM Journal on Scientific Computing
[23] Kriterien für die Selbstadaption cache-effizienter Mehrgitteralgorithmen. Diploma Thesis, Institut für Informatik, TU München, 2005.
[24] Fulton, Electronic Transactions on Numerical Analysis 15 pp 29– (2003)
[25] Adaptive solution of elliptic partial differential equations by hierarchical tensor product finite elements. Doctoral Thesis, Institut für Informatik, TU München, 2000.
[26] . DiMEPACK–A cache-optimized multigrid library. In Proceedings of International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA 2001), Las Vegas, Nevada, U.S.A., Arabnia (ed.), vol. I, 2001.
[27] , . Cachegrind: a cache-miss profiler. http://valgrind.kde.org/docs.html
[28] Adaptive Verfahren höherer Ordnung auf cache-optimalen Datenstrakturen für dreidimensionale Probleme. Doctoral Thesis, Institut für Informatik, TU München, 2005.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.