zbMATH — the first resource for mathematics

On the distribution of the largest eigenvalue in principal components analysis. (English) Zbl 1016.62078
Summary: Let $x_{(1)}$ denote the square of the largest singular value of an $n\times p$ matrix $X$, all of whose entries are independent standard Gaussian variates. Equivalently, $x_{(1)}$ is the largest principal component variance of the covariance matrix $X'X$, or the largest eigenvalue of a $p$-variate Wishart distribution with $n$ degrees of freedom and identity covariance. Consider the limit of large $p$ and $n$ with $n/p=\gamma\geq 1$. When centered by $\mu_p=(\sqrt{n-1}+\sqrt p)^2$ and scaled by $\sigma_p=(\sqrt{n-1}+\sqrt p)(1/\sqrt{n-1}+1/\sqrt p)^{1/3}$, the distribution of $x_{(1)}$ approaches the Tracy-Widom law [C. A. Tracy and H. Widom, J. Stat. Phys. 92, No. 5-6, 809-835 (1998; Zbl 0942.60099)] of order 1, which is defined in terms of a Painlevé II differential equation and can be numerically evaluated and tabulated by software. Simulations show the approximation to be informative for $n$ and $p$ as small as 5. The limit is derived via a corresponding result for complex Wishart matrices using methods from random matrix theory. The result suggests that some aspects of large $p$ multivariate distribution theory may be easier to apply in practice than their fixed $p$ counterparts.
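
The centering and scaling in the summary can be illustrated with a small simulation. The following is a minimal sketch, not the author's code: it assumes NumPy is available, and the function names `tw_center_scale` and `simulate_standardized_largest` are hypothetical names chosen for this example. It computes $\mu_p$ and $\sigma_p$ as defined above and standardizes simulated values of $x_{(1)}$; it does not evaluate the Tracy-Widom law itself.

```python
import numpy as np


def tw_center_scale(n, p):
    """Return (mu_p, sigma_p) as defined in the summary (hypothetical helper)."""
    a = np.sqrt(n - 1) + np.sqrt(p)
    mu = a ** 2
    sigma = a * (1.0 / np.sqrt(n - 1) + 1.0 / np.sqrt(p)) ** (1.0 / 3.0)
    return mu, sigma


def simulate_standardized_largest(n, p, reps, seed=0):
    """Simulate (x_(1) - mu_p) / sigma_p for n x p standard Gaussian matrices."""
    rng = np.random.default_rng(seed)
    mu, sigma = tw_center_scale(n, p)
    out = np.empty(reps)
    for i in range(reps):
        X = rng.standard_normal((n, p))
        # Singular values are returned in descending order; take the largest.
        s_max = np.linalg.svd(X, compute_uv=False)[0]
        # x_(1) is the square of the largest singular value of X,
        # i.e. the largest eigenvalue of X'X.
        out[i] = (s_max ** 2 - mu) / sigma
    return out
```

For example, `simulate_standardized_largest(40, 40, reps=1000)` returns an array whose histogram could be compared with tabulated values of the Tracy-Widom law of order 1 obtained from separate software.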

MSC:
62H25 Factor analysis and principal components; correspondence analysis
62H10 Multivariate distributions of statistics
15B52 Random matrices
33E17 Painlevé-type functions
33C45 Orthogonal polynomials and functions of hypergeometric type
60F05 Central limit and other weak theorems
References:
[1] Aldous, D. and Diaconis, P. (1999). Longest increasing subsequences: from patience sorting to the Baik-Deift-Johansson theorem. Bull. Amer. Math. Soc. 36 413-432. · Zbl 0937.60001 · doi:10.1090/S0273-0979-99-00796-X
[2] Anderson, T. W. (1963). Asymptotic theory for principal component analysis. Ann. Math. Statist. 34 122-148. · Zbl 0202.49504 · doi:10.1214/aoms/1177704248
[3] Anderson, T. W. (1996). R. A. Fisher and multivariate analysis. Statist. Sci. 11 20-34. · Zbl 0955.62504 · doi:10.1214/ss/1032209662
[4] Bai, Z. D. (1999). Methodologies in spectral analysis of large dimensional random matrices: a review. Statist. Sinica 9 611-677. · Zbl 0949.60077
[5] Baik, J., Deift, P. and Johansson, K. (1999). On the distribution of the length of the longest increasing subsequence of random permutations. J. Amer. Math. Soc. 12 1119-1178. · Zbl 0932.05001 · doi:10.1090/S0894-0347-99-00307-0 · http://links.jstor.org/sici?sici=0894-0347%28199910%2912%3A4%3C1119%3AOTDOTL%3E2.0.CO%3B2-O&origin=euclid
[6] Baker, T. H., Forrester, P. J. and Pearce, P. A. (1998). Random matrix ensembles with an effective extensive external charge. J. Phys. A 31 6087-6101. · Zbl 0912.15030 · doi:10.1088/0305-4470/31/29/002
[7] Basor, E. L. (1997). Distribution functions for random variables for ensembles of positive Hermitian matrices. Comm. Math. Phys. 188 327-350. · Zbl 0905.47016 · doi:10.1007/s002200050167
[8] Buja, A., Hastie, T. and Tibshirani, R. (1995). Penalized discriminant analysis. Ann. Statist. 23 73-102. · Zbl 0821.62031 · doi:10.1214/aos/1176324456
[9] Constantine, A. G. (1963). Some non-central distribution problems in multivariate analysis. Ann. Math. Statist. 34 1270-1285. · Zbl 0123.36801 · doi:10.1214/aoms/1177703863
Deift, P. (1999a). Integrable systems and combinatorial theory. Notices Amer. Math. Soc. 47 631-640.
Deift, P. (1999b). Orthogonal Polynomials and Random Matrices: A Riemann-Hilbert Approach. Amer. Math. Soc., Providence, RI.
[10] Dunster, T. M. (1989). Uniform asymptotic expansions for Whittaker’s confluent hypergeometric functions. SIAM J. Math. Anal. 20 744-760. · Zbl 0673.33003 · doi:10.1137/0520052
[11] Dyson, F. J. (1970). Correlations between eigenvalues of a random matrix. Comm. Math. Phys. 19 235-250. · Zbl 0221.62019 · doi:10.1007/BF01646824
[12] Eaton, M. L. (1983). Multivariate Statistics: A Vector Space Approach. Wiley, New York. · Zbl 0587.62097
[13] Edelman, A. (1988). Eigenvalues and condition numbers of random matrices. SIAM J. Matrix Anal. Appl. 9 543-560. · Zbl 0678.15019 · doi:10.1137/0609045
[14] Edelman, A. (1991). The distribution and moments of the smallest eigenvalue of a random matrix of Wishart type. Linear Algebra Appl. 159 55-80. · Zbl 0738.15010 · doi:10.1016/0024-3795(91)90076-9
[15] Erdélyi, A. (1960). Asymptotic forms for Laguerre polynomials. J. Indian Math. Soc. 24 235-250. · Zbl 0105.05401
[16] Forrester, P. J. (1993). The spectrum edge of random matrix ensembles. Nuclear Phys. B 402 709-728. · Zbl 1043.82538 · doi:10.1016/0550-3213(93)90126-A
[17] Forrester, P. J. (2000). Painlevé transcendent evaluation of the scaled distribution of the smallest eigenvalue in the Laguerre orthogonal and symplectic ensembles. Technical report. URL: http://www.lanl.gov
[18] Geman, S. (1980). A limit theorem for the norm of random matrices. Ann. Probab. 8 252-261. · Zbl 0428.60039 · doi:10.1214/aop/1176994775
[19] Gohberg, I. C. and Krein, M. G. (1969). Introduction to the Theory of Linear Non-selfadjoint Operators. Amer. Math. Soc., Providence, RI. · Zbl 0181.13504
[20] Hastings, S. P. and McLeod, J. B. (1980). A boundary value problem associated with the second Painlevé transcendent and the Korteweg-de Vries equation. Arch. Rational Mech. Anal. 73 31-51. · Zbl 0426.34019 · doi:10.1007/BF00283254
[21] Horn, R. A. and Johnson, C. R. (1985). Matrix Analysis. Cambridge Univ. Press. · Zbl 0576.15001
[22] James, A. T. (1964). Distributions of matrix variates and latent roots derived from normal samples. Ann. Math. Statist. 35 475-501. · Zbl 0121.36605 · doi:10.1214/aoms/1177703550
[23] Johansson, K. (1998). On fluctuations of eigenvalues of random Hermitian matrices. Duke Math. J. 91 151-204. · Zbl 1039.82504 · doi:10.1215/S0012-7094-98-09108-6
[24] Johansson, K. (2000). Shape fluctuations and random matrices. Comm. Math. Phys. 209 437-476. · Zbl 0969.15008 · doi:10.1007/s002200050027
[25] Mardia, K. V., Kent, J. T. and Bibby, J. M. (1979). Multivariate Analysis. Academic Press, New York. · Zbl 0432.62029
[26] Marčenko, V. A. and Pastur, L. A. (1967). Distributions of eigenvalues of some sets of random matrices. Math. USSR-Sb. 1 507-536. · Zbl 0162.22501 · doi:10.1070/SM1967v001n04ABEH001994
[27] Mehta, M. L. (1991). Random Matrices, 2nd ed. Academic Press, New York. · Zbl 0780.60014
[28] Muirhead, R. J. (1974). Powers of the largest latent root test of $\Sigma = I$. Comm. Statist. 3 513-524. · Zbl 0284.62026 · doi:10.1080/03610927408827154
[29] Muirhead, R. J. (1982). Aspects of Multivariate Statistical Theory. Wiley, New York. · Zbl 0556.62028
[30] Olver, F. W. J. (1974). Asymptotics and Special Functions. Academic Press, New York. · Zbl 0303.41035
[31] Preisendorfer, R. W. (1988). Principal Component Analysis in Meteorology and Oceanography. North-Holland, Amsterdam.
[32] Riesz, F. and Sz.-Nagy, B. (1955). Functional Analysis. Ungar, New York. · Zbl 0070.10902
[33] Soshnikov, A. (1999). Universality at the edge of the spectrum in Wigner random matrices. Comm. Math. Phys. 207 697-733. · Zbl 1062.82502 · doi:10.1007/s002200050743
[34] Soshnikov, A. (2001). A note on universality of the distribution of the largest eigenvalues in certain classes of sample covariance matrices. Technical report. · Zbl 1018.62042 · URL: http://www.lanl.gov
[35] Stein, C. (1969). Multivariate analysis I. Technical report, Dept. Statistics Stanford Univ., pages 79-81. (Notes prepared by M. L. Eaton in 1966.)
[36] Szegő, G. (1967). Orthogonal Polynomials, 3rd ed. Amer. Math. Soc., Providence. · Zbl 65.0278.03
[37] Temme, N. M. (1990). Asymptotic estimates for Laguerre polynomials. J. Appl. Math. Phys. (ZAMP) 41 114-126. · Zbl 0688.33007 · doi:10.1007/BF00946078
[38] Tracy, C. A. and Widom, H. (1994). Level-spacing distributions and the Airy kernel. Comm. Math. Phys. 159 151-174. · Zbl 0789.35152 · doi:10.1007/BF02100489
[39] Tracy, C. A. and Widom, H. (1996). On orthogonal and symplectic matrix ensembles. Comm. Math. Phys. 177 727-754. · Zbl 0851.60101 · doi:10.1007/BF02099545
[40] Tracy, C. A. and Widom, H. (1998). Correlation functions, cluster functions, and spacing distributions for random matrices. J. Statist. Phys. 92 809-835. · Zbl 0942.60099 · doi:10.1023/A:1023084324803
[41] Tracy, C. A. and Widom, H. (1999). Airy kernel and Painlevé II. Technical report. www.lanl.gov solv-int/9901004. To appear in CRM Proceedings and Lecture Notes: "Isomonodromic Deformations and Applications in Physics," J. Harnad, ed. URL: http://www.lanl.gov
[42] Tracy, C. A. and Widom, H. (2000). The distribution of the largest eigenvalue in the Gaussian ensembles. In Calogero-Moser-Sutherland Models (J. van Diejen and L. Vinet, eds.) 461-472. Springer, New York.
[43] Wachter, K. W. (1976). Probability plotting points for principal components. In Ninth Interface Symposium Computer Science and Statistics (D. Hoaglin and R. Welsch, eds.) 299-308. Prindle, Weber and Schmidt, Boston.
[44] Widom, H. (1999). On the relation between orthogonal, symplectic and unitary ensembles. J. Statist. Phys. 94 347-363. · Zbl 0935.60090 · doi:10.1023/A:1004516918143
[45] Wigner, E. P. (1955). Characteristic vectors of bordered matrices of infinite dimensions. Ann. Math. 62 548-564. · Zbl 0067.08403 · doi:10.2307/1970079 · http://links.jstor.org/sici?sici=0003-486X%28195511%292%3A62%3A3%3C548%3ACVOBMW%3E2.0.CO%3B2-8&origin=euclid
[46] Wigner, E. P. (1958). On the distribution of the roots of certain symmetric matrices. Ann. Math. 67 325-328. · Zbl 0085.13203 · doi:10.2307/1970008 · http://links.jstor.org/sici?sici=0003-486X%28195803%292%3A67%3A2%3C325%3AOTDOTR%3E2.0.CO%3B2-C&origin=euclid
[47] Wilks, S. S. (1962). Mathematical Statistics. Wiley, New York. · Zbl 0173.45805