×

Non-negative residual matrix factorization: problem definition, fast solutions, and applications. (English) Zbl 07260309

Summary: Matrix factorization is a very powerful tool to find graph patterns, e.g. communities, anomalies, etc. A recent trend is to improve the usability of the discovered graph patterns, by encoding some interpretation-friendly properties (e.g., non-negativity, sparseness, etc) in the factorization. Most, if not all, of these methods are tailored for the task of community detection. We propose NrMF, a non-negative residual matrix factorization framework, aiming to improve the interpretation for graph anomaly detection. We present two optimization formations and their corresponding optimization solutions. Our method can naturally capture abnormal behaviors on graphs. We further generalize it to admit sparse constrains in the residual matrix. The effectiveness and efficiency of the proposed algorithms are analyzed, showing that our algorithm (i) leads to a local optima; and (ii) scales to large graphs. The experimental results on several data sets validate its effectiveness as well as efficiency.

MSC:

62-XX Statistics
68-XX Computer science

Software:

Colibri; OddBall
PDF BibTeX XML Cite
Full Text: DOI

References:

[1] D. D. Lee and H. S. Seung, Algorithms for non-negative matrix factorization, In NIPS, 2000, 556-562.
[2] P. Drineas, R. Kannan, and M. W. Mahoney, Fast Monte Carlo algorithms for matrices iii: computing a compressed approximate matrix decomposition, SIAM J Comput 36 (2005), 184-206. · Zbl 1111.68149
[3] J. Sun, Y. Xie, H. Zhang, and C. Faloutsos, Less is more: compact matrix decomposition for large sparse graphs, In SDM, 2007.
[4] L. Akoglu, M. McGlohon, and C. Faloutsos, Oddball: spotting anomalies in weighted graphs, In PAKDD (2), 2010, 410-421.
[5] D. H. Chau, S. Pandit, and C. Faloutsos, Detecting fraudulent personalities in networks of online auctioneers, In PKDD, 2006, 103-114.
[6] R. M. Bell, Y. Koren, and C. Volinsky, Modeling relationships at multiple scales to improve accuracy of large recommender systems, In KDD, 2007, 95-104.
[7] A. P. Singh and G. J. Gordon, Relational learning via collective matrix factorization, In KDD, 2008, 650-658.
[8] P. Tseng, Simple polynomial-time algorithm for convex quadratic programming, In Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, 1988.
[9] S.-P. Hong and S. Verma, A note on the strong polynomiality of convex quadratic programming, Math Program 68 (1995), 131-139. · Zbl 0834.90103
[10] J. Sun, H. Qu, D. Chakrabarti, and C. Faloutsos, Neighborhood formation and anomaly detection in bipartite graphs, In ICDM, 2005, 418-425.
[11] G. H. Golub and C. F. Van-Loan, Matrix Computations, (2nd ed.), Baltimore, The Johns Hopkins University Press, 1989. · Zbl 0733.65016
[12] D. Achlioptas and F. McSherry, Fast computation of lowrank matrix approximations, J ACM 54(2) (2007), 1-19. · Zbl 1311.94031
[13] K. V. R. Kanth, D. Agrawal, and A. K. Singh, Dimensionality reduction for similarity searching in dynamic databases, In SIGMOD Conference, 1998, 166-176.
[14] P. Indyk, Stable distributions, pseudorandom generators, embeddings and data stream computation, In FOCS, 2000, 189-197.
[15] H. Tong, S. Papadimitriou, J. Sun, P. S. Yu, and C. Faloutsos, Colibri: fast mining of large static and dynamic graphs, In KDD, 2008, 686-694.
[16] D. L. Donoho and V. Stodden, When does non-negative matrix factorization give a correct decomposition into parts? In NIPS, 2003.
[17] R. Kompass, A generalized divergence measure for nonnegative matrix factorization, Neural Comput 19(3) (2007), 780-791. · Zbl 1127.68081
[18] J. Kim and H. Park, Toward faster nonnegative matrix factorization: A new algorithm and comparisons, In ICDM, 2008, 353-362.
[19] C. H. Q. Ding, T. Li, and M. I. Jordan, Convex and seminonnegative matrix factorizations, IEEE Trans Pattern Anal Mach Intell 32(1) (2010), 45-55.
[20] P. O. Hoyer, Non-negative matrix factorization with sparseness constraints, J Mach Learning Res 5 (2004), 1457-1469. · Zbl 1222.68218
[21] S. Hyv¨onen, P. Miettinen, and E. Terzi, Interpretable nonnegativematrixdecompositions,InKDD,2008, 345-353.
[22] F. Wang and P. Li, Efficient nonnegative matrix factorization with random projections, In SDM, 2010, 281-292.
[23] F. Wang, P. Li, and A. C. K¨onig, Efficient document clustering via online nonnegative matrix factorizations, In SDM, 2011, 908-919.
[24] C. C. Noble and D. J. Cook, Graph-based anomaly detection, In KDD, 2003, 631-636.
[25] D. Chakrabarti, Autopart: Parameter-free graph partitioning and outlier detection, In PKDD, 2004, 112-124.
[26] W. Eberle and L. B. Holder, Mining for structural anomalies in graph-based data, In DMIN, 2007, 376-389.
[27] V. Chandola, A. Banerjee, and V. Kumar, Anomaly detection: A survey, ACM Comput Surv 41(3) (2009), 15-58.
[28] R. Albert, H. Jeong, and A.-L. Barabasi, Diameter of the world wide web, Nature 401 (1999), 130-131.
[29] S. Dorogovtsev and J. Mendes. Evolution of networks, Adv Phys 51 (2002), 1079-1187.
[30] M. Faloutsos, P. Faloutsos, and C. Faloutsos, On powerlaw relationships of the Internet topology, In SIGCOMM, August-September 1999, 251-262. · Zbl 0889.68050
[31] A. Broder, R. Kumar, F. Maghoul1, P. Raghavan, S. Rajagopalan, R. Stata, A. Tomkins, and J. Wiener, Graph structure in the web: Experiments and models, In WWW Conference, 2000.
[32] M. E. J. Newman, The structure and function of complex networks, SIAM Rev 45 (2003), 167-256. · Zbl 1029.68010
[33] D. Xin, J. Han, X. Yan, and H. Cheng, Mining compressed frequent-pattern sets, In VLDB, 2005, 709-720.
[34] D. Kempe, J. Kleinberg, and E. Tardos, Maximizing the spread of influence through a social network, In KDD, 2003. · Zbl 1337.91069
[35] F. Chierichetti, R. Kumar, S. Lattanzi, M. Mitzenmacher, A. Panconesi, and P. Raghavan, On compressing social networks, In KDD, 2009, 219-228.
[36] G. Flake, S. Lawrence, C. L. Giles, and F. Coetzee, Selforganization and identification of web communities, IEEE Computer 35(3) (2002), 66-71.
[37] D. Gibson, J. Kleinberg, and P. Raghavan, Inferring web communities from link topology. In 9th ACM Conference on Hypertext and Hypermedia, New York, 1998, 225-234.
[38] M. Girvan and M. E. J. Newman, Community structure is social and biological networks. · Zbl 1032.91716
[39] J. Leskovec, J. M. Kleinberg, and C. Faloutsos, Graphs over time: densification laws, shrinking diameters and possible explanations, In KDD, 2005, 177-187.
[40] L. Backstrom, D. P. Huttenlocher, J. M. Kleinberg, and X. Lan, Group formation in large social networks: Membership, growth, and evolution, In KDD, 2006, 44-54.
[41] H. Tong, S. Papadimitriou, P. S. Yu, and C. Faloutsos, Proximity tracking on time-evolving bipartite graphs, In SDM, 2008, 704-715.
[42] R. Kumar, M. Mahdian, and M. McGlohon, Dynamics of conversations, In KDD, 2010, 553-562.
[43] Y.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.