Robust facial feature tracking under varying face pose and facial expression. (English) Zbl 1118.68641

Summary: A hierarchical multi-state pose-dependent approach is proposed for facial feature detection and tracking under varying facial expression and face pose. For effective and efficient representation of feature points, a hybrid representation that integrates Gabor wavelets and gray-level profiles is proposed. To model the spatial relations among feature points, a hierarchical statistical face shape model is proposed to characterize both the global shape of the human face and the local structural details of each facial component. Furthermore, multi-state local shape models are introduced to deal with shape variations of some facial components under different facial expressions. During detection and tracking, both facial component states and feature point positions, constrained by the hierarchical face shape model, are dynamically estimated using a switching hypothesized measurements model. Experimental results demonstrate that the proposed method accurately and robustly tracks facial features in real time under different facial expressions and face poses.
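
Illustration (not part of the original record): the summary mentions a hybrid feature-point representation built from Gabor wavelets and gray-level profiles. The following is a minimal sketch of what such a descriptor could look like, assuming NumPy. The function names, kernel size, wavelengths, number of orientations, Gaussian width, and profile length are all hypothetical choices made for illustration; they are not taken from the paper and do not reproduce the authors' implementation.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Complex 2-D Gabor kernel: Gaussian envelope times a complex plane wave."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Coordinate measured along the wave direction theta.
    x_theta = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    carrier = np.exp(1j * 2.0 * np.pi * x_theta / wavelength)
    return envelope * carrier

def gabor_jet(image, row, col, wavelengths=(4.0, 8.0), n_orient=4, size=15):
    """Vector of complex Gabor responses at one pixel (often called a jet)."""
    half = size // 2
    patch = image[row - half:row + half + 1, col - half:col + half + 1]
    if patch.shape != (size, size):
        raise ValueError("feature point too close to the image border")
    jet = []
    for lam in wavelengths:
        for k in range(n_orient):
            # sigma = lam / 2 is an assumed envelope width, not a value from the paper.
            kern = gabor_kernel(size, lam, np.pi * k / n_orient, sigma=lam / 2.0)
            jet.append(np.sum(patch * np.conj(kern)))
    return np.array(jet)

def gray_profile(image, row, col, direction, length=7):
    """Normalized gray-level samples along a line through the point."""
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    offsets = np.arange(length) - length // 2
    rows = np.clip(np.rint(row + offsets * d[0]).astype(int), 0, image.shape[0] - 1)
    cols = np.clip(np.rint(col + offsets * d[1]).astype(int), 0, image.shape[1] - 1)
    samples = image[rows, cols].astype(float)
    # Zero-mean, unit-variance normalization reduces sensitivity to lighting.
    return (samples - samples.mean()) / (samples.std() + 1e-8)

# Example on a synthetic image: one jet and one profile at the same point.
rng = np.random.default_rng(0)
img = rng.random((64, 64))
jet = gabor_jet(img, 32, 32)
profile = gray_profile(img, 32, 32, direction=(0, 1))
```

In a sketch like this, the jet captures local texture at several scales and orientations, while the profile is a cheap one-dimensional intensity cue; how the two are combined and weighted in the actual method is described in the paper itself, not here.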

MSC:

68T10 Pattern recognition, speech recognition

Software:

FRGC

References:

[1] B.D. Lucas, T. Kanade, An iterative image registration technique with an application to stereo vision, Proceedings of International Joint Conference on Artificial Intelligence, 1981, pp. 674-679.
[2] J. Shi, C. Tomasi, Good features to track, Proceedings of CVPR94, 1994, pp. 593-600.
[3] F. Bourel, C. Chibelushi, A. Low, Robust facial feature tracking, Proceedings of 11th British Machine Vision Conference, vol. 1, 2000, pp. 232-241.
[4] C. Tomasi, T. Kanade, Detection and tracking of point features, Carnegie Mellon University, Technical Report CMU-CS-91-132.
[5] C. Poelman, The paraperspective and projective factorization method for recovering shape and motion, Carnegie Mellon University, Technical Report CMU-CS-95-173.
[6] L. Torresani, C. Bregler, Space-time tracking, Proceedings of ECCV02, vol. 1, 2002, pp. 801-812. · Zbl 1034.68685
[7] Z. Zhu, Q. Ji, K. Fujimura, K. Lee, Combining Kalman filtering and mean shift for real time eye tracking under active IR illumination, Proceedings of ICPR02, vol. 4, 2002, pp. 318-321.
[8] Kass, M.; Witkin, A.; Terzopoulos, D., Snakes: active contour models, Int. J. Comput. Vision, 1, 4, 321-331 (1988)
[9] Yuille, A.; Hallinan, P.; Cohen, D. S., Feature extraction from faces using deformable templates, Int. J. Comput. Vision, 8, 2, 99-111 (1992)
[10] Cootes, T. F.; Taylor, C. J.; Cooper, D. H.; Graham, J., Active shape models—their training and application, Comput. Vision Image Understanding, 61, 1, 38-59 (1995)
[11] Cootes, T. F.; Edwards, G. J.; Taylor, C., Active appearance models, IEEE Trans. Pattern Anal. Mach. Intell., 23, 6, 681-685 (2001)
[12] X.W. Hou, S.Z. Li, H.J. Zhang, Q.S. Cheng, Direct appearance models, Proceedings of CVPR01, vol. 1, 2001, pp. 828-833.
[13] Wiskott, L.; Fellous, J. M.; Krüger, N.; von der Malsburg, C., Face recognition by elastic bunch graph matching, IEEE Trans. Pattern Anal. Mach. Intell., 19, 7, 775-779 (1997)
[14] Jones, M. J.; Poggio, T., Multi-dimensional morphable models: a framework for representing and matching object classes, Int. J. Comput. Vision, 29, 107-131 (1998)
[15] Sclaroff, S.; Isidoro, J., Active blobs: region-based, deformable appearance models, Comput. Vision Image Understanding, 89, 2-3, 197-225 (2003) · Zbl 1055.68099
[16] S.J. McKenna, S. Gong, R.P. Würtz, J. Tanner, D. Banin, Tracking facial feature points with Gabor wavelets and shape models, Proceedings of International Conference on Audio- and Video-based Biometric Person Authentication, 1997, pp. 35-42.
[17] M. Rogers, J. Graham, Robust active shape model search, Proceedings of ECCV, vol. 4, 2002, pp. 517-530. · Zbl 1039.68709
[18] Cootes, T. F.; Wheeler, G. V.; Walker, K. N.; Taylor, C. J., View-based active appearance models, Image Vision Comput., 20, 9, 657-664 (2002)
[19] T. Heap, D. Hogg, Wormholes in shape space: tracking through discontinuous changes in shape, Proceedings of ICCV98, 1998, pp. 344-349.
[20] Cootes, T. F.; Taylor, C. J., A mixture model for representing shape variation, Image Vision Comput., 17, 8, 567-573 (1999)
[21] C.M. Christoudias, T. Darrell, On modelling nonlinear shape-and-texture appearance manifolds, Proceedings of CVPR05, vol. 2, 2005, pp. 1067-1074.
[22] Y. Li, S. Gong, H. Liddell, Modelling faces dynamically across views and over time, Proceedings of ICCV01, vol. 1, 2001, pp. 554-559.
[23] V. Blanz, T. Vetter, A morphable model for the synthesis of 3D faces, Siggraph 1999, Computer Graphics Proceedings, 1999, pp. 187-194.
[24] J. Xiao, S. Baker, I. Matthews, T. Kanade, Real-time combined \(2 \operatorname{D} + 3 \operatorname{D} \)
[25] Sozou, P. D.; Cootes, T. F.; Taylor, C. J.; di Mauro, E., Non-linear generalization of point distribution models using polynomial regression, Image Vision Comput., 13, 5, 451-457 (1995)
[26] S. Romdhani, S. Gong, A. Psarrou, Multi-view nonlinear active shape model using kernel PCA, Proceedings of BMVC, 1999, pp. 483-492.
[27] Yan, S.; Hou, X.; Li, S. Z.; Zhang, H.; Cheng, Q., Face alignment using view-based direct appearance models, Special issue on facial image processing, analysis and synthesis, Int. J. Imaging Syst. Technol., 13, 1, 106-112 (2003)
[28] K. Grauman, T. Darrell, Fast contour matching using approximate earth mover's distance, Proceedings of CVPR04, vol. 1, 2004, pp. 220-227.
[29] Tian, Y.; Kanade, T.; Cohn, J. F., Recognizing action units for facial expression analysis, IEEE Trans. Pattern Anal. Mach. Intell., 23, 2, 97-115 (2001)
[30] A. Yilmaz, K. Shafique, M. Shah, Estimation of rigid and non-rigid facial motion using anatomical face model, Proceedings of ICPR02, vol. 1, 2002, pp. 377-380.
[31] Tao, H.; Huang, T. S., Visual estimation and compression of facial motion parameters: elements of a 3D model-based video coding system, Int. J. Comput. Vision, 50, 2, 111-125 (2002) · Zbl 1012.68780
[32] Goldenstein, S. K.; Vogler, C.; Metaxas, D., Statistical cue integration in DAG deformable models, IEEE Trans. Pattern Anal. Mach. Intell., 25, 7, 801-813 (2003)
[33] Xiao, J.; Chai, J.; Kanade, T., A closed-form solution to non-rigid shape and motion recovery, Int. J. Comput. Vision, 67, 2, 233-246 (2006) · Zbl 1477.68438
[34] Dryden, I. L.; Mardia, K. V., Statistical Shape Analysis (1998), Wiley: Wiley Chichester · Zbl 0901.62072
[35] Or, S. H.; Luk, W. S.; Wong, K. H.; King, I., An efficient iterative pose estimation algorithm, Image Vision Comput., 16, 5, 353-362 (1998)
[36] Fischler, M. A.; Bolles, R. C., Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, Commun. ACM, 24, 6, 381-395 (1981)
[37] Trucco, E.; Verri, A., Introductory Techniques for 3-D Computer Vision (1998), Prentice-Hall: Prentice-Hall Englewood Cliffs
[38] Y. Wang, T. Tan, K.-F. Loe, Joint region tracking with switching hypothesized measurements, Proceedings of ICCV03, vol. 1, 2003, pp. 75-82.
[39] H. Gu, Q. Ji, Z. Zhu, Active facial tracking for fatigue detection, Proceedings of Sixth IEEE Workshop on Applications of Computer Vision, 2002, pp. 137-142.
[40] Daugman, J., Complete discrete 2D Gabor transforms by neural networks for image analysis and compression, IEEE Trans. ASSP, 36, 7, 1169-1179 (1988) · Zbl 0709.94577
[41] Z. Zhang, M. Lyons, M. Schuster, S. Akamatsu, Comparison between geometry-based and Gabor-wavelets-based facial expression recognition using multi-layer perceptron, Proceedings of FGR98, 1998, pp. 454-459.
[42] Y. Tian, T. Kanade, J.F. Cohn, Evaluation of Gabor-wavelet-based facial action unit recognition in image sequences of increasing complexity, Proceedings of FGR02, 2002, pp. 218-223.
[43] F. Jiao, S.Z. Li, H.Y. Shum, D. Schuurmans, Face alignment using statistical models and wavelet features, Proceedings of CVPR03, vol. 1, 2003, pp. 321-327.
[44] Fleet, D. J.; Jepson, A. D., Computation of component image velocity from local phase information, Int. J. Comput. Vision, 5, 1, 77-104 (1990)
[45] Theimer, W. M., Phase-based binocular vergence control and depth reconstruction using active vision, CVGIP: Image Understanding, 60, 3, 343-358 (1994)
[46] P. Wang, Q. Ji, Learning discriminant features for multi-view face and eye detection, Proceedings of CVPR05, vol. 1, 2005, pp. 373-379.
[47] P. Wang, M.B. Green, Q. Ji, J. Wayman, Automatic eye detection and its validation, IEEE Workshop on Face Recognition Grand Challenge Experiments (with CVPR05), vol. 3, 2005, pp. 164-164.
[48] Viola, P.; Jones, M., Robust real-time object detection, Int. J. Comput. Vision, 57, 2, 137-154 (2004)
[49] P.J. Phillips, P.J. Flynn, T. Scruggs, K.W. Bowyer, J. Chang, K. Hoffman, J. Marques, J. Min, W. Worek, Overview of the face recognition grand challenge, Proceedings of CVPR05, vol. 1, 2005, pp. 947-954.