Large sample covariance matrices and high-dimensional data analysis. (English) Zbl 1380.62011

Cambridge Series in Statistical and Probabilistic Mathematics 39. Cambridge: Cambridge University Press (ISBN 978-1-107-06517-8/hbk; 978-1-107-58808-0/hbk). xiv, 308 p. (2015).
This book deals with the analysis of covariance matrices under two different frameworks: large-sample theory and high-dimensional-data theory. While the former is the classical framework for deriving asymptotics, the latter has received increasing attention because of its applications in the emerging field of big data. Given its novelty and its relevance to current research, the authors focus mainly on the high-dimensional-data framework. Large-sample theory treats the number \(p\) of variables as fixed and studies the asymptotics as the number \(n\) of observations goes to infinity. Under the high-dimensional-data approach, by contrast, the number of variables also goes to infinity, and the classical assumption is that \(p/n \rightarrow y > 0\) as \(n\) goes to infinity.
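The contrast between the two regimes can be illustrated numerically. The following is a minimal sketch, not taken from the book: it assumes i.i.d. standard normal data with identity population covariance, and the helper names `sample_cov_eigenvalues` and `mp_support` are hypothetical. It relies on the standard result that, for \(0 < y \le 1\) and unit variance, the Marchenko-Pastur law is supported on \([(1-\sqrt{y})^2, (1+\sqrt{y})^2]\).

```python
import numpy as np


def sample_cov_eigenvalues(n, p, rng):
    """Eigenvalues of the sample covariance matrix of n i.i.d. N(0, I_p) rows.

    Hypothetical helper: the mean is known to be zero, so no centering is done.
    """
    x = rng.standard_normal((n, p))
    s = x.T @ x / n
    return np.linalg.eigvalsh(s)


def mp_support(y):
    """Support edges of the Marchenko-Pastur law for 0 < y <= 1, unit variance."""
    return (1 - np.sqrt(y)) ** 2, (1 + np.sqrt(y)) ** 2


rng = np.random.default_rng(0)

# Large-sample regime: p is fixed and n is large, so p/n is close to 0
# and every eigenvalue is close to the population value 1.
eig_classical = sample_cov_eigenvalues(n=20000, p=5, rng=rng)

# High-dimensional regime: p grows with n, here with p/n = 0.25.
# The eigenvalues spread over an interval instead of concentrating at 1.
n, p = 2000, 500
eig_highdim = sample_cov_eigenvalues(n=n, p=p, rng=rng)
lower, upper = mp_support(p / n)

print("fixed p: eigenvalue range", eig_classical.min(), eig_classical.max())
print("p/n = 0.25: eigenvalue range", eig_highdim.min(), eig_highdim.max())
print("Marchenko-Pastur support edges", lower, upper)
```

In the second experiment the smallest and largest sample eigenvalues should lie close to the two support edges even though every population eigenvalue equals 1, which is why classical fixed-\(p\) approximations can be misleading when \(p\) is comparable to \(n\).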
In the first chapters, the authors present the basic results on covariance matrices, Fisher matrices and random matrices, including the Marchenko-Pastur distribution and its properties, the \(T^2\) distribution, and the asymptotics of spectral statistics.
From Chapter 6 on, several applications are illustrated. Chapter 6 analyzes the classical problem of classification, showing how to classify observations into one of two (or several) normal distributions. Chapter 9 deals with another classical problem, namely testing the equality of covariance matrices.
The theory and the applications are presented under both the large-sample theory and the high-dimensional-data theory, and thus the reader can easily appreciate the differences between the two approaches.
The material is presented in a fairly simple manner, and the reader needs only basic mathematical statistics, linear algebra, and the theory of multivariate normal distributions as prerequisites. Some technical prerequisites are collected in two appendices. The book can therefore be used by graduate students and researchers in a wide range of disciplines, from mathematics to the applied sciences (engineering, computer science, life sciences).

MSC:

62-02 Research exposition (monographs, survey articles) pertaining to statistics
62-07 Data analysis (statistics) (MSC2010)
62H30 Classification and discrimination; cluster analysis (statistical aspects)
62H20 Measures of association (correlation, canonical correlation, etc.)
62H15 Hypothesis testing in multivariate analysis
62H12 Estimation in multivariate analysis