A principal component analysis for trees. (English) Zbl 1184.62100
Summary: The active field of functional data analysis (the study of variation in a set of curves) has recently been extended to object oriented data analysis, which considers populations of more general objects. A particularly challenging extension of these ideas is to populations of tree-structured objects. We develop an analog of principal component analysis for trees, based on the notion of tree-lines, and propose numerically fast (linear time) algorithms that solve the resulting problems to proven optimality. The solutions are used to analyze a data set of 73 individuals, in which each data object is the tree of blood vessels in one person’s brain. The analysis reveals a significant relation between the age of the individuals and their brain vessel structure.

62H25 Factor analysis and principal components; correspondence analysis
62P10 Applications of statistics to biology and medical sciences; meta analysis
05C90 Applications of graph theory
65C60 Computational problems in statistics (MSC2010)
Software: fda (R)
[1] Aylward, S. and Bullitt, E. (2002). Initialization, noise, singularities and scale in height ridge traversal for tubular object centerline extraction. IEEE Transactions on Medical Imaging 21 61-75.
[2] Banks, D. and Constantine, G. M. (1998). Metric models for random graphs. J. Classification 15 199-223. · Zbl 0912.62074
[3] Breiman, L. (1996). Bagging predictors. Mach. Learn. 24 123-140. · Zbl 0867.62055
[4] Breiman, L., Friedman, J. H., Olshen, R. A. and Stone, C. J. (1984). Classification and Regression Trees. Wadsworth, Belmont, CA. · Zbl 0541.62042
[5] Bullitt, E., Zeng, D., Ghosh, A., Aylward, S. R., Lin, W., Marks, B. L. and Smith, K. (2008). The effects of healthy aging on intracerebral blood vessels visualized by magnetic resonance angiography. Neurobiology of Aging.
[6] Collins, M. and Duffy, N. (2002). Convolution kernels for natural language. In Advances in Neural Information Processing Systems 14 625-632. MIT Press, Cambridge, MA.
[7] Eom, J.-H., Kim, S., Kim, S.-H. and Zhang, B.-T. (2006). A tree kernel-based method for protein-protein interaction mining from biomedical literature. In Knowledge Discovery in Life Science Literature, PAKDD 2006 International Workshop, Proceedings. Lecture Notes in Computer Science 3886. Springer, Singapore.
[8] Everitt, B. S., Landau, S. and Leese, M. (2001). Cluster Analysis, 4th ed. Oxford Univ. Press, New York. · Zbl 1205.62076
[9] Ferraty, F. and Vieu, P. (2006). Nonparametric Functional Data Analysis: Theory and Practice. Springer, Berlin. · Zbl 1119.62046
[10] Holmes, S. (1999). Phylogenies: An overview. In Statistics and Genetics (Halloran and Geisser, eds.). IMA Volumes in Mathematics and Its Applications 112 81-119. Springer, New York. · Zbl 0939.92024
[11] Li, S., Pearl, D. K. and Doss, H. (2000). Phylogenetic tree construction using Markov chain Monte Carlo. J. Amer. Statist. Assoc. 95 493-508.
[12] Pachter, L. and Sturmfels, B. (2005). Algebraic Statistics for Computational Biology. Cambridge Univ. Press, Cambridge, UK. · Zbl 1108.62118
[13] Ramsay, J. O. and Silverman, B. W. (2002). Applied Functional Data Analysis. Springer, New York. · Zbl 1011.62002
[14] Ramsay, J. O. and Silverman, B. W. (2005). Functional Data Analysis, 2nd ed. Springer, New York. · Zbl 1079.62006
[15] Shawe-Taylor, J. and Cristianini, N. (2004). Kernel Methods for Pattern Analysis. Cambridge Univ. Press, New York. · Zbl 0994.68074
[16] Vert, J. P. (2002). A tree kernel to analyse phylogenetic profiles. Bioinformatics 18 Suppl. 1 276-284.
[17] Wang, H. and Marron, J. S. (2007). Object oriented data analysis: Sets of trees. Ann. Statist. 35 1849-1873. · Zbl 1126.62002
[18] Yamanishi, Y., Bach, F. and Vert, J. P. (2007). Glycan classification with tree kernels. Bioinformatics 23 1211-1216.