Matching markers and unlabeled configurations in protein gels. (English) Zbl 1401.92080

Summary: Unlabeled shape analysis is a rapidly emerging and challenging area of statistics. This has been driven by various novel applications in bioinformatics. We consider here the situation where two configurations are matched under various constraints, namely, the configurations have a subset of manually located “markers” with high probability of matching each other while a larger subset consists of unlabeled points. We consider a plausible model and give an implementation using the EM algorithm. The work is motivated by a real experiment of gels for renal cancer and our approach allows for the possibility of missing and misallocated markers. The methodology is successfully used to automatically locate and remove a grossly misallocated marker within the given data set.


92C40 Biochemistry, molecular biology
62P10 Applications of statistics to biology and medical sciences; meta analysis
92C50 Medical applications (general)


lp_solve; lpSolve


[1] Banks, R. E., Dunn, M. J., Hochstrasser, D. F., Sanchez, J. C., Blackstock, W., Pappin, D. J. and Selby, P. J. (2000). Proteomics: New perspectives, new biomedical opportunities. Lancet 356 1749-1756.
[2] Berkelaar, M. (2008). Interface to lp_solve v. 5.5 to solve linear/integer programs, R package.
[3] Besl, P. J. and McKay, N. D. (1992). A method for registration of 3-D shapes. IEEE Trans. PAMI 14 239-256.
[4] Chen, P. (2011). A novel kernel correlation model with the correspondence estimation. J. Math. Imaging Vision 39 100-120. · Zbl 1255.94010
[5] Chui, H. and Rangarajan, A. (2003). A new point matching algorithm for non-rigid registration. Computer Vision and Image Understanding 89 114-141. · Zbl 1053.68123
[6] Czogiel, I., Dryden, I. L. and Brignell, C. J. (2011). Bayesian matching of unlabeled marked point sets using random fields, with an application to molecular alignment. Ann. Appl. Stat. 5 2603-2629. · Zbl 1234.62141
[7] Dryden, I. L., Hirst, J. D. and Melville, J. L. (2007). Statistical analysis of unlabeled point sets: Comparing molecules in chemoinformatics. Biometrics 63 237-251, 315. · Zbl 1122.62090
[8] Dryden, I. L. and Mardia, K. V. (1998). Statistical Shape Analysis. Wiley, Chichester. · Zbl 0901.62072
[9] Dryden, I. L. and Walker, G. (1999). Highly resistant regression and object matching. Biometrics 55 820-825. · Zbl 1059.62641
[10] Fischler, M. A. and Bolles, R. C. (1981). Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Comm. ACM 24 381-395.
[11] Forgber, M., Gellrich, S., Sharav, T., Sterry, W. and Walden, P. (2009). Proteome-based analysis of serologically defined tumor-associated antigens in cutaneous lymphoma. PLoS ONE 4 e8376.
[12] Glaunes, J., Trouvé, A. and Younes, L. (2004). Diffeomorphic matching of distributions: A new approach for unlabelled point-sets and sub-manifolds matching. CVPR 2 712-718.
[13] Green, P. J. and Mardia, K. V. (2006). Bayesian alignment using hierarchical models, with applications in protein bioinformatics. Biometrika 93 235-254. · Zbl 1153.62020
[14] Green, P. J., Mardia, K. V., Nyirongo, V. B. and Ruffieux, Y. (2010). Bayesian modelling for matching and alignment of biomolecules. In The Oxford Handbook of Applied Bayesian Analysis 27-50. Oxford Univ. Press, Oxford.
[15] Kent, J. T., Mardia, K. V. and Taylor, C. C. (2010a). Matching unlabelled configurations and protein bioinformatics. Research Report STAT10-01. Univ. Leeds, Leeds, UK.
[16] Kent, J. T., Mardia, K. V. and Taylor, C. C. (2010b). An EM interpretation of the Softassign algorithm for alignment problems. In LASR 10 - High-throughput sequencing, proteins and statistics (A. Gusnanto, K. V. Mardia, C. J. Fallaize and J. Voss, eds.) 29-32. Dept. Statistics, Univ. Leeds, Leeds, UK.
[17] Mardia, K. V., Petty, E. M. and Taylor, C. C. (2012). Supplement to “Matching markers and unlabeled configurations in protein gels.” · Zbl 1401.92080
[18] McLachlan, G. J. and Krishnan, T. (2008). The EM Algorithm and Extensions, 2nd ed. Wiley, Hoboken, NJ. · Zbl 1165.62019
[19] Murphy-Chutorian, E. and Trivedi, M. M. (2008). Head pose estimation in computer vision: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 31 607-626.
[20] Petty, E. M. (2009). Shape analysis in bioinformatics. Ph.D. thesis, Univ. Leeds, Leeds, UK.
[21] Rangarajan, A., Chui, H. and Bookstein, F. L. (1997). The Softassign Procrustes matching algorithm. In Information Processing in Medical Imaging 15th International Conference, IPMI’97 Poultney 29-42. Springer, New York.
[22] Rohr, K., Cathier, P. and Wörz, S. (2004). Elastic registration of electrophoresis images using intensity information and point landmarks. Pattern Recognition 37 1035-1048. · Zbl 1056.68582
[23] Taylor, C. C., Mardia, K. V. and Kent, J. T. (2003). Matching unlabelled configurations using the EM algorithm. In LASR Proceedings: Stochastic Geometry, Biological Structure and Images (R. G. Aykroyd, K. V. Mardia and M. J. Langdon, eds.) 19-21. Dept. Statistics, Univ. Leeds, Leeds, UK.
[24] Tsin, Y. and Kanade, T. (2004). A correlation-based approach to robust point set registration. In Computer Vision - ECCV. Lecture Notes in Comput. Sci. 3023 558-569. Springer, Berlin. · Zbl 1098.68878
[25] Walker, G. (2000). Robust, non-parametric and automatic methods for matching spatial point patterns. Ph.D. thesis, Univ. Leeds, Leeds, UK.
[26] Zvelebil, M. and Baum, J. O. (2007). Understanding Bioinformatics. Garland Science, New York. · Zbl 1321.92017