swMATH ID: 8686
Software Authors: EJ Im, K Yelick, R Vuduc
Description: Sparsity: Optimization Framework for Sparse Matrix Kernels. Sparse matrix–vector multiplication is an important computational kernel that performs poorly on most modern processors due to a low compute-to-memory ratio and irregular memory access patterns. Optimization is difficult because of the complexity of cache-based memory systems and because performance is highly dependent on the non-zero structure of the matrix. The SPARSITY system is designed to address these problems by allowing users to automatically build sparse matrix kernels that are tuned to their matrices and machines. SPARSITY combines traditional techniques such as loop transformations with data structure transformations and optimization heuristics that are specific to sparse matrices. It provides a novel framework for selecting optimization parameters, such as block size, using a combination of performance models and search. In this paper we discuss the optimization of two operations: a sparse matrix times a dense vector and a sparse matrix times a set of dense vectors. Our experience indicates that register level optimizations are effective for matrices arising in certain scientific simulations, in particular finite-element problems. Cache level optimizations are important when the vector used in multiplication is larger than the cache size, especially for matrices in which the non-zero structure is random. For applications involving multiple vectors, reorganizing the computation to perform the entire set of multiplications as a single operation produces significant speedups. We describe the different optimizations and parameter selection techniques and evaluate them on several machines using over 40 matrices taken from a broad set of application domains. Our results demonstrate speedups of up to 4× for the single vector case and up to 10× for the multiple vector case.
Homepage: http://hpc.sagepub.com/content/18/1/135.short
Related Software: SparseMatrix; PETSc; MKL; Trilinos; CUDA; ATLAS; CUSPARSE; OSKI; FFTW; STREAM benchmark; STREAM; PT-Scotch; ParMETIS; BLAS; SpMV; swSpTRSV; pOSKI; SMATER; Petabricks; yaSpMV
Cited in: 10 Documents

Citations by Year