swMATH ID: 
44616

Software Authors: 
Heavner, N.; Igual, F. D.; QuintanaOrtí, G.; Martinsson, P. G.

Description: 
Algorithm 1022: efficient algorithms for computing a rankrevealing UTV factorization on parallel computing architectures. Randomized singular value decomposition (RSVD) is by now a wellestablished technique for efficiently computing an approximate singular value decomposition of a matrix. Building on the ideas that underpin RSVD, the recently proposed algorithm “randUTV”” computes a full factorization of a given matrix that provides lowrank approximations with nearoptimal error. Because the bulk of randUTV is cast in terms of communicationefficient operations such as matrixmatrix multiplication and unpivoted QR factorizations, it is faster than competing rankrevealing factorization methods such as columnpivoted QR in most highperformance computational settings. In this article, optimized randUTV implementations are presented for both sharedmemory and distributedmemory computing environments. For shared memory, randUTV is redesigned in terms of an algorithmbyblocks that, together with a runtime task scheduler, eliminates bottlenecks from data synchronization points to achieve acceleration over the standard blocked algorithm based on a purely forkjoin approach. The distributedmemory implementation is based on the ScaLAPACK library. The performance of our new codes compares favorably with competing factorizations available on both sharedmemory and distributedmemory architectures. 
Homepage: 
https://dl.acm.org/doi/10.1145/3507466

Keywords: 
numerical linear algebra;
rankrevealing matrix factorization;
singular value decomposition;
high performance;
randomized methods;
block algorithm;
algorithmbyblocks;
randomized SVD

Related Software: 
randUTV;
Algorithm 656;
SuperMatrix;
Algorithm 679;
libflame;
ScaLAPACK;
LAPACK;
FLAME

Cited in: 
1 Document
