Bayesian alignment of similarity shapes. (English) Zbl 1454.62361

Summary: We develop a Bayesian model for the alignment of two point configurations under the full similarity transformations of rotation, translation and scaling. Other work in this area has concentrated on rigid body transformations, where scale information is preserved, motivated by problems involving molecular data; this is known as form analysis. We concentrate on a Bayesian formulation for statistical shape analysis. We generalize the model introduced by P. J. Green and the first author [Biometrika 93, No. 2, 235–254 (2006; Zbl 1153.62020)] for the pairwise alignment of two unlabeled configurations to full similarity transformations by introducing a scaling factor to the model. The generalization is not straightforward, since the model needs to be reformulated to give good performance when scaling is included. We illustrate our method on the alignment of rat growth profiles and a novel application to the alignment of protein domains. Here, scaling is applied to secondary structure elements when comparing protein folds; additionally, we find that one global scaling factor is not in general sufficient to model these data and, hence, we develop a model in which multiple scale factors can be included to handle different scalings of shape components.
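As a point of reference for what "full similarity transformations" means here, the following is a minimal sketch of a classical least-squares similarity (Procrustes-type) fit in two dimensions. It is not the Bayesian model of the paper: it assumes the two configurations are labeled (the point correspondence is known), uses a single global scale factor, and handles planar points only by treating them as complex numbers. The function names `fit_similarity_2d` and `apply_similarity_2d` are hypothetical and chosen for illustration.

```python
import cmath
import math


def fit_similarity_2d(source, target):
    """Least-squares fit of target ~ scale * rotation(angle) * source + shift.

    Both arguments are lists of (x, y) tuples in matching order, i.e. the
    labeling is assumed known (unlike the unlabeled setting of the paper).
    Returns (scale, angle_in_radians, (shift_x, shift_y)).
    """
    if len(source) != len(target) or len(source) < 2:
        raise ValueError("need two equally sized configurations with >= 2 points")

    # Represent planar points as complex numbers: multiplying by a complex
    # number c rotates by arg(c) and scales by |c| in one step.
    z = [complex(x, y) for x, y in source]
    w = [complex(x, y) for x, y in target]

    # Remove translation by centering both configurations.
    z_mean = sum(z) / len(z)
    w_mean = sum(w) / len(w)
    zc = [p - z_mean for p in z]
    wc = [p - w_mean for p in w]

    denom = sum(abs(p) ** 2 for p in zc)
    if denom == 0:
        raise ValueError("source points all coincide")

    # Minimizer of sum |wc - c * zc|^2 over complex c.
    c = sum(p.conjugate() * q for p, q in zip(zc, wc)) / denom
    t = w_mean - c * z_mean
    return abs(c), cmath.phase(c), (t.real, t.imag)


def apply_similarity_2d(points, scale, angle, shift):
    """Apply rotation by `angle`, scaling by `scale`, then translation."""
    c = cmath.rect(scale, angle)
    t = complex(shift[0], shift[1])
    moved = [c * complex(x, y) + t for x, y in points]
    return [(p.real, p.imag) for p in moved]


# Usage: transform a small configuration, then recover the parameters.
landmarks = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
moved = apply_similarity_2d(landmarks, 2.0, math.pi / 6, (3.0, -1.0))
scale, angle, shift = fit_similarity_2d(landmarks, moved)
```

Dropping the scale (forcing `abs(c)` to 1) would give the rigid-body case that the summary calls form analysis; the paper's contribution lies in handling unknown correspondence and scaling within a Bayesian model, which this sketch does not attempt.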

MSC:

62P10 Applications of statistics to biology and medical sciences; meta analysis
62F15 Bayesian inference

Citations:

Zbl 1153.62020

References:

[1] Bookstein, F. L. (1991). Morphometric Tools for Landmark Data: Geometry and Biology. Cambridge Univ. Press, Cambridge. · Zbl 0770.92001
[2] Branden, C. and Tooze, J. (1999). Introduction to Protein Structure, 2nd ed. Garland, New York.
[3] Creedy, J. and Martin, V. L. (1994). A model for the distribution of prices. Oxford Bulletin of Economics and Statistics 56 67-76.
[4] Dryden, I. L., Hirst, J. D. and Melville, J. L. (2007). Statistical analysis of unlabeled point sets: Comparing molecules in chemoinformatics. Biometrics 63 237-251, 315. · Zbl 1122.62090
[5] Dryden, I. L. and Mardia, K. V. (1998). Statistical Shape Analysis. Wiley, Chichester. · Zbl 0901.62072
[6] Green, P. J. and Mardia, K. V. (2006). Bayesian alignment using hierarchical models, with applications in protein bioinformatics. Biometrika 93 235-254. · Zbl 1153.62020
[7] Green, P. J., Mardia, K. V., Nyirongo, V. B. and Ruffieux, Y. (2010). Bayesian modelling for matching and alignment of biomolecules. In The Oxford Handbook of Applied Bayesian Analysis (A. O’Hagan and M. West, eds.) 27-50. Oxford Univ. Press, Oxford.
[8] Kenobi, K. and Dryden, I. L. (2012). Bayesian matching of unlabelled point sets using Procrustes and configuration models. Bayesian Anal. 7 547-566. · Zbl 1330.62138
[9] Kenobi, K., Dryden, I. L. and Le, H. (2010). Shape curves and geodesic modelling. Biometrika 97 567-584. · Zbl 1195.62110
[10] Kent, J. T. and Mardia, K. V. (2002). Modelling strategies for spatial-temporal data. In Spatial Cluster Modelling (A. B. Lawson and D. G. T. Denison, eds.) 213-226. Chapman & Hall/CRC, Boca Raton, FL.
[11] Kent, J. T., Mardia, K. V. and Taylor, C. C. (2010). Matching unlabelled configurations and protein bioinformatics. Technical report, Univ. Leeds.
[12] Kent, J. T., Mardia, K. V., Morris, R. J. and Aykroyd, R. G. (2001). Functional models of growth for landmark data. In Proceedings in Functional and Spatial Data Analysis (K. V. Mardia and R. G. Aykroyd, eds.) 109-115. Leeds Univ. Press, Leeds.
[13] Lye, J. and Martin, V. L. (1993). Robust estimation, nonnormalities and generalized exponential distributions. J. Amer. Statist. Assoc. 88 261-267. · Zbl 0775.62081
[14] Mardia, K. V. and Jupp, P. E. (2000). Directional Statistics. Wiley, Chichester. · Zbl 0935.62065
[15] Mardia, K. V. and Nyirongo, V. B. (2012). Bayesian hierarchical alignment methods. In Bayesian Methods in Structural Bioinformatics (T. Hamelryck, K. V. Mardia and J. Ferkinghoff-Borg, eds.) 209-232. Springer, New York. · Zbl 1239.92001
[16] Mardia, K. V., Nyirongo, V. B., Fallaize, C. J., Barber, S. and Jackson, R. M. (2011). Hierarchical Bayesian modelling of pharmacophores in bioinformatics. Biometrics 67 611-619. · Zbl 1217.62183
[17] Mardia, K. V., Fallaize, C. J., Barber, S., Jackson, R. M. and Theobald, D. L. (2013). Supplement to “Bayesian alignment of similarity shapes.” · Zbl 1454.62361
[18] Orengo, C. A., Michie, A. D., Jones, D. T., Swindells, M. B. and Thornton, J. M. (1997). CATH: A hierarchic classification of protein domain structures. Structure 5 1093-1108.
[19] Rodriguez, A. and Schmidler, S. (2013). Bayesian protein structural alignment. Ann. Appl. Stat. · Zbl 1454.62387
[20] Ruffieux, Y. and Green, P. J. (2009). Alignment of multiple configurations using hierarchical models. J. Comput. Graph. Statist. 18 756-773.
[21] Schmidler, S. C. (2007). Fast Bayesian shape matching using geometric algorithms. In Bayesian Statistics 8 (J. M. Bernardo, J. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman, A. F. Smith and M. West, eds.) 471-490. Oxford Univ. Press, Oxford. · Zbl 1252.62005
[22] Srivastava, A. and Jermyn, I. H. (2009). Looking for shapes in two-dimensional cluttered point clouds. IEEE Trans. Pattern Anal. Mach. Intell. 31 1616-1629.
[23] Taylor, W. R., Thornton, J. M. and Turnell, W. G. (1983). An ellipsoidal approximation of protein shape. Journal of Molecular Graphics 1 30-38.
[24] Theobald, D. L. and Wuttke, D. S. (2006). Empirical Bayes hierarchical models for regularizing maximum likelihood estimation in the matrix Gaussian Procrustes problem. Proc. Natl. Acad. Sci. USA 103 18521-18527. · Zbl 1160.62303
[25] Wilkinson, D. J. (2007). Discussion of “Fast Bayesian shape matching using geometric algorithms.” In Bayesian Statistics 8 (J. M. Bernardo, J. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman, A. F. Smith and M. West, eds.) 483-487. Oxford Univ. Press, Oxford. · Zbl 1252.62005
[26] Wu, T. D., Schmidler, S. C., Hastie, T. and Brutlag, D. L. (1998). Regression analysis of multiple protein structures. J. Comput. Biol. 5 585-595.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.