
Extremal properties of principal curves in the plane. (English) Zbl 0867.62025

Summary: Principal curves were introduced to formalize the notion of “a curve passing through the middle of a data set”. Vaguely speaking, a curve is said to pass through the middle of a data set if every point on the curve is the average of the observations projecting onto it. This idea can be made precise by defining principal curves for probability densities.
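One common way to write the "average of the observations projecting onto it" condition is the self-consistency equation below. The notation is an assumption introduced here for illustration and does not appear in this record: \(X\) is a random vector in the plane with a density, \(f\) is a smooth curve parametrized by \(\lambda\), and \(\lambda_f(x)\) is the parameter value of the point on \(f\) closest to \(x\).

```latex
\[
  f(\lambda) \;=\; \mathbb{E}\bigl[\, X \,\bigm|\, \lambda_f(X) = \lambda \,\bigr]
  \quad \text{for all } \lambda,
  \qquad
  \lambda_f(x) \in \arg\min_{\lambda} \, \lVert x - f(\lambda) \rVert .
\]
```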
We study principal curves in the plane. Like linear principal components, principal curves are critical points of the expected squared distance from the data. However, the largest and smallest principal components are extrema of the distance, whereas all principal curves are saddle points. This explains why cross-validation does not appear to be a viable method for choosing the complexity of principal curve estimates.
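The statement about linear principal components can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes a synthetic two-dimensional Gaussian sample (the covariance matrix, sample size, and grid are arbitrary choices) and evaluates the mean squared distance from the data to every line through the sample mean, indexed by its angle. It illustrates only the linear case; it does not demonstrate the saddle-point result for principal curves.

```python
import numpy as np


def mean_sq_dist_to_line(X, theta):
    """Mean squared distance from the rows of X to the line through the
    sample mean with direction (cos(theta), sin(theta))."""
    u = np.array([np.cos(theta), np.sin(theta)])
    Xc = X - X.mean(axis=0)            # center the data
    proj = Xc @ u                      # coordinate along the line
    resid = Xc - np.outer(proj, u)     # component orthogonal to the line
    return float(np.mean(np.sum(resid ** 2, axis=1)))


def angle(v):
    """Angle of a direction vector, folded into [0, pi)."""
    return np.arctan2(v[1], v[0]) % np.pi


# Synthetic data: an assumption made for illustration only.
rng = np.random.default_rng(0)
X = rng.multivariate_normal([0.0, 0.0], [[3.0, 1.0], [1.0, 1.0]], size=2000)

# Distance as a function of the line's angle on a fine grid.
thetas = np.linspace(0.0, np.pi, 721)
d = np.array([mean_sq_dist_to_line(X, t) for t in thetas])

# Eigen-decomposition of the (biased) sample covariance; eigh sorts ascending.
evals, evecs = np.linalg.eigh(np.cov(X.T, bias=True))

theta_min = thetas[d.argmin()]   # angle of the closest line
theta_max = thetas[d.argmax()]   # angle of the farthest line

print("angle minimizing distance:", theta_min, "largest PC:", angle(evecs[:, 1]))
print("angle maximizing distance:", theta_max, "smallest PC:", angle(evecs[:, 0]))
```

In this sketch the distance is smallest along the largest principal component and largest along the smallest one, because for a unit direction \(u\) the mean squared distance equals the trace of the covariance matrix minus \(u^{\top} \Sigma u\).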

MSC:

62G07 Density estimation
62H99 Multivariate analysis
62J02 General nonlinear regression
62H25 Factor analysis and principal components; correspondence analysis
62H30 Classification and discrimination; cluster analysis (statistical aspects)
