# zbMATH — the first resource for mathematics

Fast approximation of matrix coherence and statistical leverage. (English) Zbl 1437.65030
Summary: The statistical leverage scores of a matrix $$A$$ are the squared row-norms of the matrix containing its (top) left singular vectors and the coherence is the largest leverage score. These quantities are of interest in recently-popular problems such as matrix completion and Nyström-based low-rank matrix approximation as well as in large-scale statistical data analysis applications more generally; moreover, they are of interest since they define the key structural nonuniformity that must be dealt with in developing fast randomized matrix algorithms. Our main result is a randomized algorithm that takes as input an arbitrary $$n \times d$$ matrix $$A$$, with $$n >> d$$, and that returns as output relative-error approximations to all $$n$$ of the statistical leverage scores. The proposed algorithm runs (under assumptions on the precise values of $$n$$ and $$d$$) in $$O(nd \log n)$$ time, as opposed to the $$O(nd^{2})$$ time required by the naïve algorithm that involves computing an orthogonal basis for the range of $$A$$. Our analysis may be viewed in terms of computing a relative-error approximation to an underconstrained least-squares approximation problem, or, relatedly, it may be viewed as an application of Johnson-Lindenstrauss type ideas. Several practically-important extensions of our basic result are also described, including the approximation of so-called cross-leverage scores, the extension of these ideas to matrices with $$n \approx d$$, and the extension to streaming environments.

##### MSC:
 65F55 Numerical methods for low-rank matrix approximation; matrix compression 65F15 Numerical computation of eigenvalues and eigenvectors of matrices 68W20 Randomized algorithms
Blendenpik; LSRN
Full Text: