
Sparse representation based Fisher discrimination dictionary learning for image classification. (English) Zbl 1328.68189

Summary: The employed dictionary plays an important role in sparse representation or sparse coding based image reconstruction and classification, while learning dictionaries from the training data has led to state-of-the-art results in image classification tasks. However, many dictionary learning models exploit only the discriminative information in either the representation coefficients or the representation residual, which limits their performance. In this paper we present a novel dictionary learning method based on the Fisher discrimination criterion. A structured dictionary, whose atoms have correspondences to the subject class labels, is learned, with which not only the representation residual can be used to distinguish different classes, but also the representation coefficients have small within-class scatter and large between-class scatter. The classification scheme associated with the proposed Fisher discrimination dictionary learning (FDDL) model is consequently presented by exploiting the discriminative information in both the representation residual and the representation coefficients. The proposed FDDL model is extensively evaluated on various image datasets, and it shows superior performance to many state-of-the-art dictionary learning methods in a variety of classification tasks.
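As a rough illustration of the scatter quantities the summary refers to, the following Python sketch computes the traces of the within-class and between-class scatter of a set of coefficient vectors. It is not the FDDL objective or the authors' implementation; the function name `scatter_traces`, the column-per-sample layout of `X`, and the toy data are assumptions made here for illustration only.

```python
import numpy as np

def scatter_traces(X, labels):
    """Return (tr(S_W), tr(S_B)) for coefficient vectors stored as columns of X.

    Hypothetical helper: X has one column per sample, labels has one entry per column.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    overall_mean = X.mean(axis=1, keepdims=True)
    within, between = 0.0, 0.0
    for c in np.unique(labels):
        Xc = X[:, labels == c]                      # coefficients of class c
        class_mean = Xc.mean(axis=1, keepdims=True)
        # Spread of class-c coefficients around their own mean.
        within += np.sum((Xc - class_mean) ** 2)
        # Spread of the class mean around the overall mean, weighted by class size.
        between += Xc.shape[1] * np.sum((class_mean - overall_mean) ** 2)
    return within, between

# Toy coefficients (assumed data): two classes, each concentrated on a different atom.
X = np.array([[1.0, 1.2, 0.0, 0.1],
              [0.0, 0.1, 1.0, 0.9]])
labels = np.array([0, 0, 1, 1])
within, between = scatter_traces(X, labels)
```

In this toy case each class concentrates on a different coefficient, so the within-class trace comes out much smaller than the between-class trace, which is the kind of behaviour a Fisher-type criterion rewards.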

MSC:

68T10 Pattern recognition, speech recognition
68T05 Learning and adaptive systems in artificial intelligence
68U10 Computing methodologies for image processing

Software:

FRGC; Multi-PIE; AR face
