Algorithm 1022

swMATH ID: 44616
Software Authors: Heavner, N.; Igual, F. D.; Quintana-Ortí, G.; Martinsson, P. G.
Description: Algorithm 1022: efficient algorithms for computing a rank-revealing UTV factorization on parallel computing architectures. Randomized singular value decomposition (RSVD) is by now a well-established technique for efficiently computing an approximate singular value decomposition of a matrix. Building on the ideas that underpin RSVD, the recently proposed algorithm “randUTV”” computes a full factorization of a given matrix that provides low-rank approximations with near-optimal error. Because the bulk of randUTV is cast in terms of communication-efficient operations such as matrix-matrix multiplication and unpivoted QR factorizations, it is faster than competing rank-revealing factorization methods such as column-pivoted QR in most high-performance computational settings. In this article, optimized randUTV implementations are presented for both shared-memory and distributed-memory computing environments. For shared memory, randUTV is redesigned in terms of an algorithm-by-blocks that, together with a runtime task scheduler, eliminates bottlenecks from data synchronization points to achieve acceleration over the standard blocked algorithm based on a purely fork-join approach. The distributed-memory implementation is based on the ScaLAPACK library. The performance of our new codes compares favorably with competing factorizations available on both shared-memory and distributed-memory architectures.
Homepage: https://dl.acm.org/doi/10.1145/3507466
Keywords: numerical linear algebra; rank-revealing matrix factorization; singular value decomposition; high performance; randomized methods; block algorithm; algorithm-by-blocks; randomized SVD
Related Software: randUTV; Algorithm 656; SuperMatrix; Algorithm 679; libflame; ScaLAPACK; LAPACK; FLAME
Cited in: 1 Document

Citations by Year