zbMATH — the first resource for mathematics

Solving dense generalized eigenproblems on multi-threaded architectures. (English) Zbl 1278.65037
Summary: We compare two approaches to compute a fraction of the spectrum of dense symmetric definite generalized eigenproblems: one is based on the reduction to tridiagonal form, and the other on the Krylov-subspace iteration. Two large-scale applications, arising in molecular dynamics and material science, are employed to investigate the contributions of the application, architecture, and parallelism of the method to the performance of the solvers. The experimental results on a state-of-the-art 8-core platform, equipped with a graphics processing unit (GPU), reveal that in realistic applications, iterative Krylov-subspace methods can be a competitive approach also for the solution of dense problems.

65F15 Numerical computation of eigenvalues and eigenvectors of matrices
65Y05 Parallel numerical computation
Full Text: DOI
[1] Aliaga, Jose I.; Bollhöfer, Matthias; Martín, Alberto F.; Quintana-Ortí, Enrique S., Exploiting thread-level parallelism in the iterative solution of sparse linear systems, Parallel computing, 37, 3, 183-202, (2011) · Zbl 1216.65039
[2] Anderson, E.; Bai, Z.; Demmel, J.; Dongarra, J.E.; DuCroz, J.; Greenbaum, A.; Hammarling, S.; McKenney, A.E.; Ostrouchov, S.; Sorensen, D., LAPACK users’ guide, (1992), SIAM Philadelphia · Zbl 0843.65018
[3] ARPACK project home page. <http://www.caam.rice.edu/software/ARPACK/>.
[4] Badia, Rosa M.; Herrero, José R.; Labarta, Jesús; Pérez, Josep M.; Quintana-Ortí, Enrique S.; Quintana-Ortí, Gregorio, Parallelizing dense and banded linear algebra libraries using smpss, Concurrency and computation: practice and experience, 21, 18, 2438-2456, (2009)
[5] Barrachina, Sergio; Castillo, Maribel; Igual, Francisco D.; Mayo, Rafael; Quintana-Ortí, Enrique S.; Quintana-Ortí, Gregorio, Exploiting the capabilities of modern GPUs for dense matrix computations, Concurrency and computation: practice and experience, 21, 18, 2457-2477, (2009)
[6] Bientinesi, P.; Dhillon, I.S.; van de Geijn, R., A parallel eigensolver for dense symmetric matrices based on multiple relatively robust representations, SIAM journal on scientific computing, 27, 1, 43-66, (2005) · Zbl 1087.65032
[7] Bientinesi, Paolo; Igual, Francisco D.; Kressner, Daniel; Petschow, Matthias; Quintana-Ortí, Enrique S., Condensed forms for the symmetric eigenvalue problem on multi-threaded architectures, Concurrency and computation: practice and experience, 23, 7, 94-707, (2011)
[8] Bischof, C.H.; Lang, B.; Sun, X., Algorithm 807: the SBR toolbox—software for successive band reduction, ACM transaction on mathematic software, 26, 4, 602-616, (2000) · Zbl 1365.65104
[9] S. Blügel, G. Bihlmayer, D. Wortmann, C. Friedrich, M. Heide, M. Lezaic, F. Freimuth, and M. Betzinger. The Jülich FLEUR project, 1987. <http://www.flapw.de> .
[10] Buttari, Alfredo; Langou, Julien; Kurzak, Jakub; Dongarra, Jack, A class of parallel tiled linear algebra algorithms for multicore architectures, Parallel computing, 35, 1, 38-53, (2009)
[11] Cui, Q.; Bahar, I., Normal mode analysis theoretical and applications to biological and chemical systems, Mathematical and computational biology, (2005), Chapman & Hall/CRC
[12] CULA project home page. <http://www.culatools.com/>.
[13] D. Davidović, E.S. Quintana-Ortí, Applying OOC techniques in the reduction to condensed form for very large symmetric eigenproblems on GPUs, in: Proceedings of the 20th Euromicro Conference on Parallel Distributed and Network based Processing - PDP, 2012, pp. 442-449.
[14] Dhillon, Inderjit S.; Parlett, Beresford N., Multiple representations to compute orthogonal eigenvectors of symmetric tridiagonal matrices, Linear algebra applications, 387, 1-28, (2004) · Zbl 1055.65048
[15] FLAME project home page. <http://www.cs.utexas.edu/users/flame/>.
[16] Giraud, L.; Langou, J.; Rozloznik, M.; van den Eshof, J., Rounding error analysis of the classic gram – schmidt orthogonalization process, Numerische Mathematik, 101, 1, 87-100, (2005) · Zbl 1075.65060
[17] Golub, Gene H.; Van Loan, Charles F., Matrix computations, (1996), The Johns Hopkins University Press Baltimore, 3rd edition · Zbl 0865.65009
[18] J.R. Lopéz-Blanco, J.I. Garzón, P. Chacón, iMod: multipurpose normal mode analysis in internal coordinates, Bioinformatics (2011), http://dx.doi.org/10.1093/bioinformatics/btr497.
[19] MAGMA project home page. <http://icl.cs.utk.edu/magma/>.
[20] NVIDIA Corporation. NVIDIA CUDA Compute Unified Device Architecture Programming Guide, 2.3.1 edition, August 2009.
[21] M. Petschow, E. Peise, P. Bientinesi, High-performance solvers for large-scale dense eigensolvers, Technical Report AICES-2011/09-X, AICES, RWTH-Aachen, 2011. · Zbl 1264.65054
[22] PLASMA project home page. <http://icl.cs.utk.edu/plasma/>.
[23] Quintana-Ortí, Gregorio; Quintana-Ortí, Enrique S.; van de Geijn, Robert; Zee, Field Van; Chan, Ernie, Programming matrix algorithms-by-blocks for thread-level parallelism, ACM transactions on mathematic software, 36, 3, 14:1-14:26, (2009) · Zbl 1364.65105
[24] Skjaerven, L.; Hollup, S.M.; Reuter, N., Normal mode analysis for proteins, J. mol. struct., 898, 42-48, (2009)
[25] Tama, F.; Brooks, C.L., SYMMETRY, FORM, and SHAPE: guiding principles for robustness in macromolecular machines, Annu. rev. bioph. biom., 35, 1, 115-133, (2006)
[26] Vasily Volkov, James Demmel. LU, QR and Cholesky factorizations using vector capabilities of GPUs, Technical Report UCB/EECS-2008-49, EECS Department, University of California, Berkeley, May 2008.
[27] Field G. Van Zee, libflame: The Complete Reference, 2009. <http://www.lulu.com>.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.