×

Nonnegative tensor factorization as an alternative Csiszar-Tusnady procedure: algorithms, convergence, probabilistic interpretations and novel probabilistic tensor latent variable analysis algorithms. (English) Zbl 1235.65041

Summary: In this paper we study Nonnegative Tensor Factorization (NTF) based on the Kullback-Leibler (KL) divergence as an alternative Csiszar-Tusnady procedure. We propose new update rules for the aforementioned divergence that are based on multiplicative update rules. The proposed algorithms are built on solid theoretical foundations that guarantee that the limit point of the iterative algorithm corresponds to a stationary solution of the optimization procedure. Moreover, we study the convergence properties of the optimization procedure and we present generalized pythagorean rules. Furthermore, we provide clear probabilistic interpretations of these algorithms. Finally, we discuss the connections between generalized probabilistic tensor latent variable models (PTLVM) and NTF, proposing in that way algorithms for PTLVM for arbitrary multivariate probabilistic mass functions.

MSC:

65F30 Other matrix algorithms (MSC2010)
62H25 Factor analysis and principal components; correspondence analysis
68T05 Learning and adaptive systems in artificial intelligence
60B10 Convergence of probability measures

Software:

FERET; XM2VTSDB
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Bro R, Kiers HAL, Andersson CA (1999) PARAFAC2–part II. Modeling chromatographic data with retention time shifts. J Chemom 13: 295–309 · doi:10.1002/(SICI)1099-128X(199905/08)13:3/4<295::AID-CEM547>3.0.CO;2-Y
[2] Carroll J, Chang J (1970) Analysis of individual differences in multidimensional scaling via an n-way generalization of ”eckart-young” decomposition. Psychometrika 35: 283–319 · Zbl 0202.19101 · doi:10.1007/BF02310791
[3] Cichocki A, Zdunek R, Choi S, Plemmons R, Amari S (2007) Non-negative tensor factorization using alpha and beta divergences. In: Proceedings of IEEE international conference on acoustics, speech and signal processing (ICASSP07), vol 3, Honolulu, Hawaii, USA, pp 1393–1396
[4] Cichocki A, Zdunek R, Amari S (2008) Nonnegative matrix and tensor factorization. IEEE Signal Process Mag 25(1): 142–145 · Zbl 1172.94390 · doi:10.1109/MSP.2008.4408452
[5] De Lathauwer L, De Moor B, Vandewalle J (2000) A multilinear singular value decomposition. SIAM J Matrix Anal Appl 21(4): 1253–1278 · Zbl 0962.15005 · doi:10.1137/S0895479896305696
[6] Ding C, He X, Simon HD (2005) On the equivalence of nonnegative matrix factorization and spectral clustering. In: Proceedings of SIAM international conference on data mining, Philadelphia, pp 606–610
[7] Ding C, Li T, Peng W (2008) On the equivalence between non-negative matrix factorization and probabilistic latent semantic indexing. Comput Stat Data Anal 52(8): 3913–3927 · Zbl 1452.62053 · doi:10.1016/j.csda.2008.01.011
[8] Donoho D, Stodden V (2004) When does non-negative matrix factorization give a correct decomposition into parts? Adv Neural Inf Process Syst 17
[9] Finesso L, Spreij C (2006) Nonnegative matrix factorization and I-divergence alternative minimization. Linear Algebra Appl 416(2–3): 270–286 · Zbl 1109.15007 · doi:10.1016/j.laa.2005.11.012
[10] Friedlander MP, Hatz K (2006) Computing nonnegative tensor factorizations. Technical Report TR-2006-21, University of British Columbia Department of Computer Science
[11] Gaussier E, Goutte C (2005) Relation between PLSA and NMF and implications. In: Proceedings of the 28th annual international ACM SIGIR conference on research and development in information retrieval (SIGIR’05), Salvador, Brazil, pp 601–602
[12] Gonzalez E, Zhang Y (2005) Accelarating the Lee-Seung algorithm for nonnegative matrix factorization. TR-05-02
[13] Harshman RA (1970) Foundations of the PARAFAC procedure: models and conditions for an ”explanatory” multi-modal factor analysis. UCLA working papers in phonetics
[14] Hazan T, Polak S, Shashua A (2005) Sparse image coding using a 3D non-negative tensor factorization. In: Tenth IEEE international conference on computer vision, 2005 (ICCV 2005), vol 1, Beijing, China, pp 50–57
[15] Hofmann T (1999) Probabilistic latent semantic indexing. In: Proceedings of the twenty-second annual international SIGIR conference on research and development in information retrieval (SIGIR-99)
[16] Kiers HAL, Berge JMFt, Bro R (1999) PARAFAC2–part I. A direct fitting algorithm for the PARAFAC2 model. J Chemom 13: 275–294 · doi:10.1002/(SICI)1099-128X(199905/08)13:3/4<275::AID-CEM543>3.0.CO;2-B
[17] Kim Y-D, Choi S (2007) Nonnegative tucker decomposition. In: Proceedings of the IEEE CVPR-2007 workshop on component analysis methods
[18] Kim T-K, Cipolla R (2008) Canonical correlation analysis of video volume tensors for action categorization and detection. In: IEEE Trans Pattern Anal Mach Intell (accepted for publication)
[19] Kim Y-D, Cichocki A, Choi S (2008) Nonnegative Tucker decomposition with alpha-divergence. In: Proceedings of the IEEE international conference on acoustics, speech, and signal processing (ICASSP-2008)
[20] Kolda TG, Bader BW (2009) Tensor Decompositions and applications. SIAM Rev 51:3
[21] Kotsia I, Zafeiriou S, Pitas I (2007) A novel discriminant nonnegative matrix factorization method with application to facial image characterization problems. IEEE Trans Inf Forensics Secur 2(3): 588–595 · doi:10.1109/TIFS.2007.902017
[22] Kruskal JB (1977) Three way arrays: rank and uniqueness of trilinear decomposition, with application to arithmetic complexity and statistics. Linear Algebra Appl 18: 95–138 · Zbl 0364.15021 · doi:10.1016/0024-3795(77)90069-6
[23] Lee DD, Seung HS (1999) Learning the parts of objects by non-negative matrix factorization. Nature 401: 788–791 · Zbl 1369.68285 · doi:10.1038/44565
[24] Lee DD, Seung HS (2000) Algorithms for non-negative matrix factorization. In: NIPS, pp 556–562
[25] Lin C-J (2007a) On the convergence of multiplicative update for nonnegative matrix factorization. IEEE Trans Neural Netw (accepted for publication)
[26] Lin C-J (2007b) Projected gradients for nonnegative matrix factorization. Neural Comput (accepted for publication) · Zbl 1173.90583
[27] Lu H, Plataniotis KN, Venetsanopoulos AN (2008) MPCA: multilinear principal component analysis of tensor objects. IEEE Trans Neural Netw 18(1): 18–39
[28] Messer K, Matas J, Kittler JV, Luettin J, Maitre G (1999) XM2VTSDB: the extended M2VTS database. In: AVBPA99, Washington, DC, USA, 22–23 March 1999, pp 72–77
[29] Mørup M, Hansen LK, Arnfred SM (2008) Algorithms for sparse nonnegative tucker decomposition. Neural Comput 20(8): 2112–2131 · Zbl 1178.68447 · doi:10.1162/neco.2008.11-06-407
[30] Paatero P, Tapper U (1994) Positive matrix factorization: a non-negative factor model with optimal utilization of error estimates of data values. Environmetrics 111–126
[31] Pascual-Montano A, Carazo JM, Kochi K, Lehmann D, Pascual-Marqui RD (2006) Nonsmooth nonnegative matrix factorization (nsNMF). IEEE Trans Pattern Anal Mach Intell 28(3): 403–415 · Zbl 05110885 · doi:10.1109/TPAMI.2006.60
[32] Phillips PJ, Moon H, Rauss PJ, Rizvi S (2000) The FERET evaluation methodology for face recognition algorithms. IEEE Trans Pattern Anal Mach Intell 22(10): 1090–1104 · Zbl 05112402 · doi:10.1109/34.879790
[33] Phillips PJ, Wechsler H, Huang J, Rauss P (2009) The FERET database and evaluation procedure for face recognition algorithms. Image Vis Comput 16(5): 295–306 · doi:10.1016/S0262-8856(97)00070-X
[34] Raj RG, Bovik AC (2008) MICA: a multilinear ICA decomposition for natural scene modeling. IEEE Trans Image Process 17(3): 259–271 · Zbl 05516585 · doi:10.1109/TIP.2007.916158
[35] Shashanka MV, Raj B, Smaragdis P (2007) Probabilistic latent variable model for sparse decompositions of non-negative data. Unpublished Draft · Zbl 1173.94383
[36] Shashanka MV, Raj B, Smaragdis P (2008) Probabilistic latent variable models as non-negative factorizations. Comput Intell Neurosci J
[37] Shashua A, Hazan T (2005) Non-negative tensor factorization with applications to statistics and computer vision. In: International conference of machine learning (ICML)
[38] Shashua A, Zass R, Hazan T (2006) Multi-way clustering using super-symmetric non-negative tensor factorization. In: Proceedings of the European conference on computer vision (ECCV)
[39] Shiryaev AN (1996) Probability, 2nd edn. Springer, Berlin · Zbl 0909.01009
[40] Sidiropoulos ND, Bro R (2000) On the uniqueness of multilinear decomposition of N-way arrays. J Chemom 37: 229–239 · doi:10.1002/1099-128X(200005/06)14:3<229::AID-CEM587>3.0.CO;2-N
[41] Smaragdis P, Raj B (2007) Shift-invariant probabilistic latent component analysis. Mitsubishi Electric Research Laboratories TR2007-009
[42] Smilde A, Bro R, Geladi P (2004) Multi-way analysis: applications in the chemical sciences. Wiley, New York
[43] Sra S, Dhillon IS (2006) Nonnegative Matrix approximation: algorithms and applications. University of Texas at Austin Department of Computer Science Technical Report TR-06-27
[44] Tao D, Li X, Wu X, Maybank SJ (2007) General tensor discriminant analysis and gabor features for gait recognition. IEEE Trans Pattern Anal Mach Intell 10(29): 1700–1715 · Zbl 05340998 · doi:10.1109/TPAMI.2007.1096
[45] Tucker LR (1966) Some mathematical notes on three-mode factor analysis. Psychometrika 31: 279–311 · doi:10.1007/BF02289464
[46] Yan S, Xu D, Yang Q, Zhang L, Tang X, Zhang H-J (2007) Multilinear discriminant analysis for face recognition. IEEE Trans Image Process 1(16): 212–220 · Zbl 05453841 · doi:10.1109/TIP.2006.884929
[47] Zafeiriou S (2009a) Discriminant nonnegative tensor factorization algorithms. IEEE Trans Neural Netw 20(2): 217–235 · Zbl 1192.68840 · doi:10.1109/TNN.2008.2005293
[48] Zafeiriou S (2009) Algorithms for nonnegative tensor factorization. In: Ferna’ndez SA, L. Garci’a R, Tao D, Li X (eds) Tensors in image processing and computer vision. Springer, Berlin · Zbl 1192.68840
[49] Zafeiriou S, Tefas A, Buciu I, Pitas I (2006) Exploitining discriminant information in nonnegative matrix factorization with application to frontal face verification. IEEE Trans Neural Netw 14(8): 1063–1073
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.