Summary: In many applications, the data consist of (or may be naturally formulated as) an $m \times n$ matrix $A$. It is often of interest to find a low-rank approximation to $A$, i.e., an approximation $D$ to the matrix $A$ of rank not greater than a specified rank $k$, where $k$ is much smaller than $m$ and $n$. Methods such as the Singular Value Decomposition (SVD) may be used to find an approximation to $A$ which is the best in a well-defined sense. These methods require memory and time which are superlinear in $m$ and $n$; for many applications in which the data sets are very large this is prohibitive. Two simple and intuitive algorithms are presented which, when given an $m \times n$ matrix $A$, compute a description of a low-rank approximation $D^*$ to $A$, and which are qualitatively faster than the SVD. Both algorithms have provable bounds for the error matrix $A - D^*$. For any matrix $X$, let $\|X\|_F$ and $\|X\|_2$ denote its Frobenius norm and its spectral norm, respectively. In the first algorithm, $c$ columns of $A$ are randomly chosen. If the $m \times c$ matrix $C$ consists of those columns of $A$ (after appropriate rescaling), then it is shown that from $C^T C$ approximations to the top $k$ singular values and corresponding singular vectors may be computed. From the computed singular vectors a description $D^*$ of the matrix $A$ may be computed such that $\mathrm{rank}(D^*) \le k$ and such that
$$\|A - D^*\|_\xi^2 \;\le\; \min_{D:\,\mathrm{rank}(D)\le k} \|A - D\|_\xi^2 + \mathrm{poly}(k, 1/c)\,\|A\|_F^2$$
holds with high probability for both $\xi = 2, F$. This algorithm may be implemented without storing the matrix $A$ in random access memory (RAM), provided it can make two passes over the matrix stored in external memory and use $O(cm + c^2)$ additional RAM. The second algorithm is similar except that it further approximates the matrix $C$ by randomly sampling $r$ rows of $C$ to form an $r \times c$ matrix $W$. Thus, it has additional error, but it can be implemented in three passes over the matrix using only constant additional RAM. To achieve an additional error (beyond the best rank-$k$ approximation) that is at most $\varepsilon \|A\|_F^2$, both algorithms take time which is polynomial in $k$, $1/\varepsilon$, and $\log(1/\delta)$, where $\delta > 0$ is a failure probability; the first takes time linear in $\max(m, n)$ and the second takes time independent of $m$ and $n$. The bounds improve previously published results with respect to the rank parameter $k$ for both the Frobenius and spectral norms. In addition, the proofs for the error bounds use a novel method that makes important use of matrix perturbation theory. The probability distribution over the columns of $A$ and the rescaling are crucial features of the algorithms, which must be chosen judiciously.
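The column-sampling idea behind the first algorithm can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes columns are drawn with replacement with probabilities proportional to their squared Euclidean norms, rescaled by $1/\sqrt{c\,p_j}$, and that the top-$k$ left singular vectors of the sampled matrix $C$ are then used to project $A$; the function name `sampled_low_rank` is hypothetical.

```python
import numpy as np

def sampled_low_rank(A, k, c, seed=None):
    """Sketch of randomized column sampling for low-rank approximation.

    Draws c columns of A with probabilities p_j proportional to the
    squared column norms, rescales each chosen column by 1/sqrt(c * p_j)
    (so that C @ C.T estimates A @ A.T), and projects A onto the span of
    the top-k left singular vectors of the sampled matrix C.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    col_norms_sq = np.sum(A * A, axis=0)
    p = col_norms_sq / col_norms_sq.sum()      # sampling distribution over columns
    idx = rng.choice(n, size=c, replace=True, p=p)
    C = A[:, idx] / np.sqrt(c * p[idx])        # rescaled m-by-c sample
    U, _, _ = np.linalg.svd(C, full_matrices=False)
    H = U[:, :k]                               # approximate top-k left singular vectors
    return H @ (H.T @ A)                       # rank-at-most-k approximation to A

A = np.random.default_rng(0).standard_normal((100, 80))
D = sampled_low_rank(A, k=5, c=40, seed=1)
print(D.shape)  # (100, 80)
```

Note that the projection step shown here touches all of $A$ for clarity; the point of the algorithm in the paper is that only a description of the approximation (the sampled columns and computed singular vectors) need be stored, in $O(cm + c^2)$ additional RAM.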
[For Part I see the authors, ibid. 36, No. 1, 132–157 (2006; Zbl 1111.68147), reviewed above.]