Joint dimensionality reduction and metric learning for image set classification.

*(English)* Zbl 1457.68243

Summary: Compared with the traditional classification task based on a single image, an image set contains more complementary information, which greatly benefits the correct classification of a query subject. Thus, image set classification has attracted much attention from researchers. However, the main challenge is how to represent an image set effectively so as to fully exploit its latent discriminative features. Unlike previous works, in which an image set was represented by a single or a hybrid mode, in this paper we propose a novel multi-model fusion method, spanning the Euclidean space and the Riemannian manifold, that jointly accomplishes dimensionality reduction and metric learning. To this end, we first introduce three distance metric learning models, namely Euclidean-Euclidean, Riemannian-Riemannian and Euclidean-Riemannian, to better exploit the complementary information of an image set. Then, we aim to simultaneously learn two mappings that perform dimensionality reduction, together with a metric matrix, by integrating the two heterogeneous spaces (i.e., the Euclidean space and the Riemannian manifold space) into a common induced Mahalanobis space in which within-class data sets are close and between-class data sets are separated. This strategy addresses a severe drawback of existing set-based methods, which do not consider distance metric learning when performing dimensionality reduction. Furthermore, to learn a complete Mahalanobis metric, we adopt an \(L_{2,1}\)-regularized metric matrix for optimal feature selection and classification. The results of extensive experiments on face recognition, object classification, gesture recognition and handwritten classification demonstrate the effectiveness of the proposed method compared with other image-set-based algorithms.
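The terms used in the summary can be made concrete with a minimal Python sketch. It is not the authors' algorithm: the summary gives neither the objective function nor the optimization procedure, so nothing here is learned. The sketch assumes that an image set is represented in the Euclidean space by its mean vector and on the Riemannian manifold by the matrix logarithm of its covariance matrix (a common Log-Euclidean choice, not confirmed by the summary). The mappings `W_e` and `W_r`, the metric matrix `M`, and all function names are hypothetical placeholders filled with random values.

```python
import numpy as np


def set_mean(X):
    """Assumed Euclidean representation: mean of the set's feature vectors (rows of X)."""
    return X.mean(axis=0)


def log_covariance(X, eps=1e-3):
    """Assumed Riemannian representation: matrix log of the regularized covariance.

    The covariance is a symmetric positive-definite (SPD) matrix; its logarithm
    is computed from the eigendecomposition and flattened to a vector.
    """
    C = np.cov(X, rowvar=False) + eps * np.eye(X.shape[1])
    w, V = np.linalg.eigh(C)
    L = (V * np.log(w)) @ V.T  # V diag(log w) V^T
    return L.reshape(-1)


def mahalanobis_sq(u, v, M):
    """Squared Mahalanobis distance (u - v)^T M (u - v)."""
    d = u - v
    return float(d @ M @ d)


def l21_norm(M):
    """L_{2,1} norm: sum of the Euclidean norms of the rows of M."""
    return float(np.linalg.norm(M, axis=1).sum())


# Two toy "image sets" with 4-dimensional features (random stand-ins for real data).
rng = np.random.default_rng(0)
A = rng.normal(size=(20, 4))
B = rng.normal(size=(25, 4))

# Hypothetical mappings into a common 3-dimensional space, and a PSD metric matrix.
# In a real method these would be learned; here they are random placeholders.
W_e = rng.normal(size=(4, 3))    # reduces the Euclidean representation
W_r = rng.normal(size=(16, 3))   # reduces the flattened log-covariance
G = rng.normal(size=(3, 3))
M = G @ G.T                      # positive semidefinite by construction

# The three distance types named in the summary, evaluated in the common space.
d_ee = mahalanobis_sq(set_mean(A) @ W_e, set_mean(B) @ W_e, M)
d_rr = mahalanobis_sq(log_covariance(A) @ W_r, log_covariance(B) @ W_r, M)
d_er = mahalanobis_sq(set_mean(A) @ W_e, log_covariance(B) @ W_r, M)

print("Euclidean-Euclidean:  ", d_ee)
print("Riemannian-Riemannian:", d_rr)
print("Euclidean-Riemannian: ", d_er)
print("L21 norm of M:        ", l21_norm(M))
```

Because an \(L_{2,1}\) penalty sums row norms, minimizing it tends to drive entire rows of the metric matrix to zero, which is why it is commonly associated with feature selection.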

##### MSC:

68T05 | Learning and adaptive systems in artificial intelligence |

62H30 | Classification and discrimination; cluster analysis (statistical aspects) |

62H35 | Image analysis in multivariate analysis |

##### Keywords:

image set classification; feature learning; kernel; dimensionality reduction; metric learning; heterogeneous space fusion
