On camera calibration with linear programming and loop constraint linearization. (English) Zbl 1235.68252

Summary: A technique for calibrating a network of perspective cameras based on their graph of trifocal tensors is presented. After a set of reliable epipolar geometries has been estimated, a parameterization of the graph of trifocal tensors is proposed in which each trifocal tensor is linearly encoded by a 4-vector. The strength of this parameterization is that the homographies relating two adjacent trifocal tensors, as well as the projection matrices, depend linearly on the parameters. Two methods are developed for estimating these parameters globally while taking into account loops in the graph. Both methods are based on sequential linear programming: the first relies on a locally linear approximation of the polynomials involved in the loop constraints, whereas the second uses alternating minimization. Both methods have the advantage of being non-incremental and of distributing the error uniformly across all the cameras. Experiments carried out on several real data sets demonstrate the accuracy of the proposed approach and its efficiency in distributing errors over the whole set of cameras.
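
As an illustration only, one generic step of sequential linear programming with linearized equality constraints can be written as follows. This is a sketch under stated assumptions, not the formulation of the paper: the stacked parameter vector \(\theta\), the linear data term \(A\theta - b\), the polynomial loop-constraint map \(g\), its Jacobian \(J_g\), the slack vector \(s\), and the trust-region radius \(\delta_k\) are all hypothetical names introduced here.

```latex
% Hypothetical notation: \theta stacks the 4-vectors of all trifocal tensors;
% g(\theta) = 0 collects the polynomial loop constraints (assumed form).
\begin{align*}
\text{Nonlinear problem:}\quad
  & \min_{\theta}\ \|A\theta - b\|_1
    \quad \text{s.t.}\quad g(\theta) = 0, \\[4pt]
\text{Linearization at } \theta^{(k)}:\quad
  & g(\theta) \approx g\bigl(\theta^{(k)}\bigr)
    + J_g\bigl(\theta^{(k)}\bigr)\,\bigl(\theta - \theta^{(k)}\bigr), \\[4pt]
\text{LP step:}\quad
  & \theta^{(k+1)} \in \arg\min_{\theta,\,s}\ \mathbf{1}^{\top} s \\
  & \text{s.t.}\quad -s \le A\theta - b \le s, \\
  & \phantom{\text{s.t.}\quad}
    g\bigl(\theta^{(k)}\bigr)
    + J_g\bigl(\theta^{(k)}\bigr)\,\bigl(\theta - \theta^{(k)}\bigr) = 0, \\
  & \phantom{\text{s.t.}\quad}
    \bigl\|\theta - \theta^{(k)}\bigr\|_{\infty} \le \delta_k .
\end{align*}
```

Each step is a linear program because the \(\ell_1\) objective is expressed through the slack vector \(s\), the constraints are affine in \(\theta\), and the \(\ell_\infty\) trust region reduces to box bounds. The trust region is a common safeguard that keeps the iterate where the first-order approximation is reasonable; whether the paper uses such a safeguard is not stated in the summary.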


MSC: 68T45 Machine vision and scene understanding


Software: ParallAX; SIFT

