Boosted string representation and its application to video surveillance. (English) Zbl 1147.68676

Summary: This paper presents a new behavior analysis system that analyzes human movements via a boosted string representation. First, we propose a triangulation-based method to transform each action sequence into a set of symbols, so that the sequence can be interpreted and analyzed through this string representation. Such analysis must tackle three practical problems: an action sequence usually exhibits different temporal scalings, different initial states, and symbol conversion errors. Traditional methods (such as hidden Markov models and finite state machines) have limited ability to deal with these problems, since many unknown states must be constructed and initialized. To tackle them, a novel string hypothesis generator is proposed that generates a bank of string features, from which different invariant features can be learned to classify behaviors more accurately. To learn the invariant features, the AdaBoost algorithm is modified and used to train a strong classifier from the set of string hypotheses, so that multiple human action events can be classified well. In addition, a forward classification scheme is proposed to classify input action sequences more accurately even when they contain various scaling changes and coding errors. Experimental results indicate that the proposed method is a robust and accurate tool for human movement analysis.
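
The summary does not spell out how the string hypotheses are generated or how AdaBoost is modified, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes that temporal scaling is absorbed by merging runs of repeated symbols, that the hypotheses are short substrings, and that each weak classifier simply tests whether a substring occurs. It uses plain two-class discrete AdaBoost for brevity, whereas the paper addresses multiple action classes. All function names (`collapse_runs`, `string_hypotheses`, `train_adaboost`, `classify`) are hypothetical.

```python
import math

def collapse_runs(s):
    """Merge consecutive repeated symbols, e.g. 'aaabbc' -> 'abc'.
    Assumption: this is one simple way to absorb temporal scaling."""
    out = []
    for ch in s:
        if not out or out[-1] != ch:
            out.append(ch)
    return "".join(out)

def string_hypotheses(strings, max_len=3):
    """Collect every substring of up to max_len symbols as a candidate feature."""
    feats = set()
    for s in strings:
        for k in range(1, max_len + 1):
            for i in range(len(s) - k + 1):
                feats.add(s[i:i + k])
    return sorted(feats)

def train_adaboost(strings, labels, rounds=5):
    """Two-class discrete AdaBoost; labels must be +1 or -1.
    Weak classifier: 'string contains substring f', with polarity p."""
    data = [collapse_runs(s) for s in strings]
    feats = string_hypotheses(data)
    n = len(data)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        best = None
        for f in feats:
            for p in (1, -1):
                preds = [p if f in s else -p for s in data]
                err = sum(wi for wi, y, h in zip(w, labels, preds) if h != y)
                if best is None or err < best[0]:
                    best = (err, f, p, preds)
        err, f, p, preds = best
        # Clamp the weighted error so the logarithm below stays finite.
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, f, p))
        if err <= 1e-10:
            break  # the chosen weak classifier already separates the data
        # Re-weight: misclassified samples gain weight, then normalize.
        w = [wi * math.exp(-alpha * y * h) for wi, y, h in zip(w, labels, preds)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def classify(model, s):
    """Sign of the weighted vote of the selected substring tests."""
    s = collapse_runs(s)
    score = sum(a * (p if f in s else -p) for a, f, p in model)
    return 1 if score >= 0 else -1

# Toy symbol strings: class +1 moves a -> b -> c, class -1 moves c -> b -> a.
train = ["aabbcc", "abbbc", "aaabc", "ccbbaa", "cba", "cbbba"]
labels = [1, 1, 1, -1, -1, -1]
model = train_adaboost(train, labels)
print(classify(model, "aaaabbbbbcc"), classify(model, "ccccba"))
```

In this toy setting, a test string with a different temporal scaling than any training string is still matched, because run merging maps it to the same collapsed string. Handling different initial states and symbol conversion errors, as the summary requires, would need a richer hypothesis set than plain substrings.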

MSC:

68T10 Pattern recognition, speech recognition

Software:

AdaBoost.MH
