×

zbMATH — the first resource for mathematics

Blocked algorithms for the reduction to Hessenberg-triangular form revisited. (English) Zbl 1157.65348
Summary: We present two variants of C. B. Moler and G. W. Stewart’s algorithm for reducing a matrix pair to Hessenberg-triangular (HT) form with increased data locality in the access to the matrices [SIAM J. Numer. Anal. 10, 241–256 (1973; Zbl 0253.65019)]. In one of these variants, a careful reorganization and accumulation of Givens rotations enables the use of efficient level 3 BLAS. Experimental results on four different architectures, representative of current high performance processors, compare the performances of the new variants with those of the implementation of Moler and Stewart’s algorithm in subroutine DGGHRD from LAPACK, Dackland and Kågström’s two-stage algorithm for the HT form, and a modified version of the latter which requires considerably less flops.

MSC:
65F15 Numerical computation of eigenvalues and eigenvectors of matrices
65Y20 Complexity and performance of numerical algorithms
Software:
BLAS; CLAPACK; GEMM; LAPACK
PDF BibTeX XML Cite
Full Text: DOI
References:
[1] E. Anderson, Z. Bai, C. H. Bischof, S. Blackford, J. W. Demmel, J. J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. C. Sorensen, LAPACK Users’ Guide, 3rd edn., SIAM, Philadelphia, PA, 1999. · Zbl 0934.65030
[2] C. H. Bischof, B. Lang, and X. Sun, A framework for symmetric band reduction, ACM Trans. Math. Softw., 26 (2000), pp. 581–601. · Zbl 1365.65103 · doi:10.1145/365723.365735
[3] C. H. Bischof and C. F. Van Loan, The WY representation for products of Householder matrices, SIAM J. Sci. Stat. Comput., 8 (1987), pp. S2–S13. · Zbl 0628.65033
[4] K. Braman, R. Byers, and R. Mathias, The multishift QR algorithm. I. Maintaining well-focused shifts and level 3 performance, SIAM J. Matrix Anal. Appl., 23 (2002), pp. 929–947. · Zbl 1017.65031 · doi:10.1137/S0895479801384573
[5] M. Cosnard, J.-M. Muller, and Y. Robert, Parallel QR decomposition of a rectangular matrix, Numer. Math., 48 (1986), pp. 239–249. · Zbl 0589.65024 · doi:10.1007/BF01389871
[6] K. Dackland and B. Kågström, Blocked algorithms and software for reduction of a regular matrix pair to generalized Schur form, ACM Trans. Math. Softw., 25 (1999), pp. 425–454. · Zbl 0962.65041 · doi:10.1145/332242.332244
[7] J. J. Dongarra, D. C. Sorensen, and S. J. Hammarling, Block reduction of matrices to condensed forms for eigenvalue computations, J. Comput. Appl. Math., 27 (1989), pp. 215–227. (Reprinted in Parallel algorithms for numerical linear algebra, North-Holland, Amsterdam, 1990, pp. 215–227) · Zbl 0679.65025
[8] G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd edn., Johns Hopkins University Press, Baltimore, MD, 1996. · Zbl 0865.65009
[9] K. Goto and R. van de Geijn, High-performance implementation of the level-3 BLAS, ACM Trans. Math. Software, 35 (2008). · Zbl 1190.65064
[10] B. Kågström, P. Ling, and C. F. Van Loan, GEMM-based level 3 BLAS: Algorithms for the model implementations, ACM Trans. Math. Softw., 24 (1999), pp. 268–302. · Zbl 0930.65047 · doi:10.1145/292395.292412
[11] B. Kågström, P. Ling, and C. F. Van Loan, GEMM-based level 3 BLAS: High-performance model implementations and performance evaluation benchmark, ACM Trans. Math. Softw., 24 (1999), pp. 303–316. · Zbl 0930.65047 · doi:10.1145/292395.292426
[12] B. Kågström and D. Kressner, Multishift variants of the QZ algorithm with aggressive early deflation, SIAM J. Matrix Anal. Appl., 29 (2006), pp. 199–227. (Also appeared as LAPACK working note 173.) · Zbl 1137.65017
[13] D. Kressner, Numerical Methods and Software for General and Structured Eigenvalue Problems, PhD thesis, TU Berlin, Institut für Mathematik, Berlin, Germany, 2004. · Zbl 1060.65588
[14] B. Lang, Effiziente Orthogonaltransformationen bei der Eigen- und Singulärwertzerlegung. Habilitationsschrift, 1997. · Zbl 0885.65040
[15] B. Lang, Using level 3 BLAS in rotation-based algorithms, SIAM J. Sci. Comput., 19 (1998), pp. 626–634. · Zbl 0912.65032 · doi:10.1137/S1064827595280211
[16] H. P. Märchy, On a modification of the QZ algorithm with fast Givens rotations, Computing, 38 (1987), pp. 247–259. · Zbl 0613.65033 · doi:10.1007/BF02240099
[17] J. J. Modi and M. R. B. Clarke, An alternative Givens ordering, Numer. Math., 43 (1984), pp. 83–90. · Zbl 0504.65013 · doi:10.1007/BF01389639
[18] C. B. Moler and G. W. Stewart, An algorithm for generalized matrix eigenvalue problems, SIAM J. Numer. Anal., 10 (1973), pp. 241–256. · Zbl 0253.65019 · doi:10.1137/0710024
[19] G. Quintana-Ortí and R. A. van de Geijn, Improving the performance of reduction to Hessenberg form, ACM Trans. Math. Softw., 32 (2006), pp. 180–194. · Zbl 1365.65094 · doi:10.1145/1141885.1141887
[20] H. Rutishauser, On Jacobi rotation patterns, in Proc. Sympos. Appl. Math., vol. XV, pp. 219–239, Amer. Math. Soc., Providence, R.I., 1963.
[21] R. Schreiber and C. F. Van Loan, A storage-efficient WY representation for products of Householder transformations, SIAM J. Sci. Stat. Comput., 10 (1989), pp. 53–57. · Zbl 0664.65025 · doi:10.1137/0910005
[22] H.-R. Schwarz, Die Reduktion einer symmetrischen Bandmatrix auf tridiagonale Form, Z. Angew. Math. Mech., 45 (1965), pp. T75–T77. · Zbl 0223.65010
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.