Two-view motion segmentation with model selection and outlier removal by RANSAC-enhanced Dirichlet process mixture models. (English) Zbl 1477.68377

Summary: We propose a novel motion segmentation algorithm based on mixture of Dirichlet process (MDP) models. In contrast to previous approaches, we treat motion segmentation and model selection, that is, choosing the number of motion models, as a single inseparable problem. Our algorithm simultaneously infers the number of motion models, estimates the cluster memberships of correspondences, and identifies the outliers. The main idea is to use MDP models to fully exploit the geometric consistencies before making premature decisions about the number of motion models. To handle outliers, we incorporate RANSAC into the inference process of MDP models. In the experiments, we compare the proposed algorithm with naive RANSAC, GPCA and Schindler’s method on both synthetic data and real image data. The experimental results show that the algorithm can handle more motions and performs satisfactorily in the presence of various levels of noise and outliers.

MSC:

68T45 Machine vision and scene understanding
62H30 Classification and discrimination; cluster analysis (statistical aspects)

Software:

SIFT; FIVEPOINT

References:

[1] Adiv, G. (1985). Determining three-dimensional motion and structure from optical flow generated by several moving objects. IEEE Transactions on Pattern Analysis and Machine Intelligence, 7(4), 384–401. · doi:10.1109/TPAMI.1985.4767678
[2] Antoniak, C. E. (1974). Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems. Annals of Statistics, 2(6), 1152–1174. · Zbl 0335.60034 · doi:10.1214/aos/1176342871
[3] Ballard, D. H., & Kimball, O. A. (1983). Rigid body motion from depth and optical flow. Computer Vision Graphics and Image Processing, 22(1), 95–115. · doi:10.1016/0734-189X(83)90097-X
[4] Bober, M., & Kittler, J. (1994). Robust motion analysis. In Proc. IEEE conf. on computer vision and pattern recognition (pp. 947–952).
[5] Chum, O., Matas, J., & Kittler, J. (2003). Locally optimized RANSAC. In Proc. 25th DAGM symposium (Vol. 2781, pp. 236–243).
[6] Costeira, J. P., & Kanade, T. (1998). A multibody factorization method for independently moving objects. International Journal of Computer Vision, 29(3), 159–179. · Zbl 05470335 · doi:10.1023/A:1008000628999
[7] Escobar, M. D., & West, M. (1995). Bayesian density estimation and inference using mixtures. Journal of the American Statistical Association, 90(430), 577–588. · Zbl 0826.62021 · doi:10.1080/01621459.1995.10476550
[8] Ferguson, T. (1973). A Bayesian analysis of some nonparametric problems. Annals of Statistics, 1(2), 209–230. · Zbl 0255.62037 · doi:10.1214/aos/1176342360
[9] Fischler, M. A., & Bolles, R. C. (1981). Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6), 381–395. · doi:10.1145/358669.358692
[10] Gruber, A., & Weiss, Y. (2006). Incorporating non-motion cues into 3D motion segmentation. In Proc. European conference on computer vision.
[11] Hartley, R. I., & Zisserman, A. (2004). Multiple view geometry in computer vision (2nd edn.). Cambridge: Cambridge University Press. · Zbl 1072.68104
[12] Horn, B. (1986). Robot vision. New York: McGraw-Hill.
[13] Illingworth, J., & Kittler, J. (1987). The adaptive Hough transform. IEEE Transactions on Pattern Analysis and Machine Intelligence, 9(5), 690–698. · doi:10.1109/TPAMI.1987.4767964
[14] Jian, Y. D., & Chen, C. S. (2007). Two-view motion segmentation by mixtures of Dirichlet process with model selection and outlier removal. In Proc. international conference on computer vision.
[15] Kanatani, K. (2002). Evaluation and selection of models for motion segmentation. In Proc. European conference on computer vision (Vol. 2352, pp. 335–349). · Zbl 1039.68662
[16] Kumar, M. P., Torr, P. H. S., & Zisserman, A. (2005). Learning layered motion segmentations of video. In Proc. international conference on computer vision (Vol. 1, pp. 33–40).
[17] Li, H. W., Lavin, M. A., & Lemaster, R. J. (1986). Fast Hough transform: a hierarchical approach. Computer Vision Graphics and Image Processing, 36(2–3), 139–161. · doi:10.1016/0734-189X(86)90073-3
[18] Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), 91–110. · Zbl 02244065 · doi:10.1023/B:VISI.0000029664.99615.94
[19] MacEachern, S. N., & Müller, P. (1998). Estimating mixture of Dirichlet process models. Journal of Computational and Graphical Statistics, 7(2), 223–238.
[20] Makadia, A., Geyer, C., Sastry, S., & Daniilidis, K. (2005). Radon-based structure from motion without correspondences. In Proc. IEEE conf. on computer vision and pattern recognition.
[21] Morita, T., & Kanade, T. (1997). A sequential factorization method for recovering shape and motion from image streams. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(8), 858–867. · Zbl 05112142 · doi:10.1109/34.608289
[22] Neal, R. M. (2000). Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics, 9(2), 249–265.
[23] Nistér, D. (2004). An efficient solution to the five-point relative pose problem. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(6), 756–770. · Zbl 05111562 · doi:10.1109/TPAMI.2004.17
[24] Orbanz, P., & Buhmann, J. M. (2006). Smooth image segmentation by nonparametric Bayesian inference. In Proc. European conference on computer vision.
[25] Schindler, K., & Suter, D. (2006). Two-view multibody structure-and-motion with outliers through model selection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(6), 983–995. · Zbl 05111595 · doi:10.1109/TPAMI.2006.130
[26] Schindler, K., Suter, D., & Wang, H. (2008). A model-selection framework for multibody structure-and-motion of image sequences. International Journal of Computer Vision, 79(2), 159–177. · Zbl 05322251 · doi:10.1007/s11263-007-0111-7
[27] Shashua, A., Zass, R., & Hazan, T. (2006). Multi-way clustering using super-symmetric non-negative tensor factorization. In Proc. European conference on computer vision.
[28] Stewénius, H., Engels, C., & Nistér, D. (2006). Recent developments on direct relative orientation. ISPRS Journal of Photogrammetry and Remote Sensing, 60(4), 284–294. · doi:10.1016/j.isprsjprs.2006.03.005
[29] Sudderth, E., Torralba, A., Freeman, W., & Willsky, A. (2006). Depth from familiar objects: a hierarchical model for 3D scenes. In Proc. IEEE conf. on computer vision and pattern recognition (Vol. 2, pp. 2410–2417).
[30] Sugaya, Y., & Kanatani, K. (2004). Multi-stage unsupervised learning for multi-body motion segmentation. IEICE Transactions on Information and Systems, E87-D(7), 1935–1942. · Zbl 1098.68868
[31] Tian, T. Y., & Shah, M. (1997). Recovering 3D motion of multiple objects using adaptive Hough transform. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(10), 1178–1183. · doi:10.1109/34.625131
[32] Tomasi, C., & Kanade, T. (1992). Shape and motion from image streams under orthography: a factorization method. International Journal of Computer Vision, 9(2), 137–154. · doi:10.1007/BF00129684
[33] Torr, P. H. S. (1998). Geometric motion segmentation and model selection. Philosophical Transactions of the Royal Society of London Series A, 356(1740), 1321–1338. · Zbl 0902.68201 · doi:10.1098/rsta.1998.0224
[34] Tron, R., & Vidal, R. (2007). A benchmark for the comparison of 3D motion segmentation algorithms. In Proc. IEEE conf. on computer vision and pattern recognition.
[35] Tuzel, O., Subbarao, R., & Meer, P. (2005). Simultaneous multiple 3D motion estimation via mode finding on Lie groups. In Proc. international conference on computer vision (Vol. 1, pp. 18–25).
[36] Vidal, R., & Hartley, R. (2004). Motion segmentation with missing data using PowerFactorization and GPCA. In Proc. IEEE conf. on computer vision and pattern recognition.
[37] Vidal, R., Ma, Y., Soatto, S., & Sastry, S. (2006). Two-view multibody structure from motion. International Journal of Computer Vision, 68(1), 7–25. · Zbl 1477.68498 · doi:10.1007/s11263-005-4839-7
[38] Wills, J., Agarwal, S., & Belongie, S. (2006). A feature-based approach for dense segmentation and estimation of large disparity motion. International Journal of Computer Vision, 68(2), 125–143. · Zbl 05062702 · doi:10.1007/s11263-006-6660-3
[39] Wolf, L., & Shashua, A. (2001). Two-body segmentation from two perspective views. In Proc. IEEE conf. on computer vision and pattern recognition (Vol. 1, pp. 263–270).
[40] Xiao, J. J., & Shah, M. (2005). Motion layer extraction in the presence of occlusion using graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(10), 1644–1659. · Zbl 05112485 · doi:10.1109/TPAMI.2005.202
[41] Yan, J. Y., & Pollefeys, M. (2006). A general framework for motion segmentation: independent, articulated, rigid, non-rigid, degenerate and non-degenerate. In Proc. European conference on computer vision.
[42] Yang, A., Rao, S., & Ma, Y. (2006). Robust statistical estimation and segmentation of multiple subspaces. In Proc. IEEE workshop on 25 years of RANSAC (joint with CVPR).
[43] Zhang, Z. (2000). A flexible new technique for camera calibration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(11), 1330–1334. · doi:10.1109/34.888718