
Automatic object extraction and reconstruction in active video. (English) Zbl 1132.68645

Summary: A new method of video object extraction is proposed to automatically extract the object of interest from actively acquired videos. Traditional video object extraction techniques often operate under the assumption of homogeneous object motion and extract various parts of the video that are motion consistent as objects. In contrast, the proposed Active Video Object Extraction (AVOE) approach assumes that the object of interest is being actively tracked by a non-calibrated camera under general motion, and classifies the possible camera movements that produce the 2D motion patterns recovered from the image sequence. Consequently, the AVOE method is able to extract the single object of interest from the active video. We formalize the AVOE process using notions from Gestalt psychology. We define a new Gestalt factor called “shift and hold” and present 2D object extraction algorithms. Moreover, since an active video sequence naturally contains multiple views of the object of interest, we demonstrate that these views can be combined to form a single 3D object regardless of whether the object is static or moving in the video.

MSC:

68T10 Pattern recognition, speech recognition

Software:

SBA; ParallAX
