
Efficient three-dimensional scene modeling and mosaicing. (English) Zbl 1243.68304

Summary: Scene modeling plays a key role in applications ranging from visual mapping to augmented reality. This paper presents an end-to-end solution for creating accurate three-dimensional (3D) textured models from monocular video sequences. The methods are developed within the framework of sequential structure from motion, in which a 3D model of the environment is maintained and updated as new visual information becomes available. The proposed approach makes contributions at several levels. The camera pose is recovered by directly associating the 3D scene model with local image observations, using a dual-registration approach. Compared with standard structure from motion techniques, this approach decreases error accumulation and increases robustness to scene occlusions and feature association failures, while allowing 3D reconstruction of any type of scene. Motivated by the need to map large areas, a novel 3D vertex selection mechanism is proposed that takes the geometry of the scene into account. Vertices are selected not only for high reconstruction accuracy but also for being representative of the local shape of the scene. This reduces the complexity of the final 3D model with minimal loss of precision. As a final step, a composite visual map of the scene (mosaic) is generated. We present a method for blending image textures using 3D geometric information and photometric differences between registered textures. The method allows high-quality mosaicing over 3D surfaces by reducing the distortions induced by changes in camera viewpoint and illumination. Results are presented for four scene modeling scenarios, including a comparison with ground truth under a realistic scenario and a challenging underwater data set. Although developed primarily for underwater mapping applications, the methods are general and applicable to other domains, such as aerial and land-based mapping.

MSC:

68T45 Machine vision and scene understanding
68T40 Artificial intelligence for robotics

Software:

SIFT; MESH; SBA
