Heterogeneous anomaly detection in social diffusion with discriminative feature discovery. (English) Zbl 1448.91216

Summary: Social diffusion is a dynamic process of information propagation within social networks. In this paper, we study social diffusion from the perspective of discriminative features, a set of features differentiating the behaviors of social network users. We propose a new parameter-free framework, named HADISD, based on modeling and interpreting discriminative features. It utilizes a probability-distribution-based parameter-free method to identify the maximum vertex set with specified features. Using the maximum vertex set, a probability-distribution-based optimization approach is applied to find the minimum number of vertices in each feature category with the maximum discriminative information. HADISD includes an incremental algorithm to update the discriminative vertex set over time. The proposed model is capable of addressing anomaly detection in social diffusion, and the results can be leveraged for both spammer detection and influence maximization. The findings from our extensive experiments on four real-life datasets show the efficiency and effectiveness of the proposed scheme.

MSC:

91D30 Social networks; opinion dynamics
91-05 Experimental work for problems pertaining to game theory, economics, and finance

Software:

ELKI

References:

[1] Akoglu, L.; Tong, H.; Vreeken, J.; Faloutsos, C., Fast and reliable anomaly detection in categorical data, CIKM, 415-424 (2012)
[2] Bhagat, S.; Goyal, A.; Lakshmanan, L., Maximizing product adoption in social networks, WSDM, 603-612 (2012)
[3] Brown, J. A.; Lee, J.; Kraev, N., Reputation systems for non-player character interactions based on player actions, Thirteenth AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 1-7 (2017)
[4] Brown, J. A.; Qu, Q., Systems for player reputation with NPC agents, 2015 IEEE Conference on Computational Intelligence and Games, 546-547 (2015)
[5] Buccafurri, F.; Lax, G.; Nocera, A.; Ursino, D., Discovering missing me edges across social networks, Inf. Sci., 319, 18-37 (2015) · Zbl 1390.91261
[6] Chandola, V.; Banerjee, A.; Kumar, V., Anomaly detection: a survey, ACM Comput. Surv. (2009)
[7] Chandola, V.; Banerjee, A.; Kumar, V., Anomaly detection for discrete sequences: a survey, IEEE Trans. Knowl. Data Eng., 823-839 (2012)
[8] Chen, F.; Tan, P.; Jain, A. K., A co-classification framework for detecting web spam and spammers in social media web sites, CIKM, 1807-1810 (2009)
[9] Chen, W.; Collins, A.; Cummings, R.; Ke, T.; Liu, Z.; Rincon, D.; Sun, X.; Wang, Y.; Wei, W.; Yuan, Y., Influence maximization in social networks when negative opinions may emerge and propagate, SDM, 379-390 (2011)
[10] Chen, W.; Lu, W.; Zhang, N., Time-critical influence maximization in social networks with time-delayed diffusion process, AAAI (2012)
[11] Chen, W.; Wang, C.; Wang, Y., Scalable influence maximization for prevalent viral marketing in large-scale social networks, SIGKDD, 1029-1038 (2010)
[12] Chen, W.; Wang, Y.; Yang, S., Efficient influence maximization in social networks, SIGKDD, 199-208 (2009)
[13] Das, S.; Matthews, B. L.; Srivastava, A. N.; Oza, N. C., Multiple kernel learning for heterogeneous anomaly detection: algorithm and aviation safety case study, SIGKDD, 47-56 (2010)
[14] Domingos, P.; Richardson, M., Mining the network value of customers, SIGKDD, 57-66 (2001)
[15] Du, N.; Song, L.; Gomez-Rodriguez, M.; Zha, H., Scalable Influence Estimation in Continuous-time Diffusion Networks, NIPS, 3147-3155 (2013)
[16] Faloutsos, C.; Ranganathan, M.; Manolopoulos, Y., Fast subsequence matching in time-series databases, SIGMOD, 419-429 (1994)
[17] Fujimaki, R.; Nakata, T.; Tsukahara, H.; Sato, A.; Yamanishi, K., Mining abnormal patterns from heterogeneous time-series with irrelevant features for fault event detection, Stat. Anal. Data Min., 1-17 (2009) · Zbl 1166.62066
[18] Fung, G. P. C.; Yu, J. X.; Yu, P. S.; Lu, H., Parameter free bursty events detection in text streams, VLDB, 181-192 (2005)
[19] Gao, J.; Tan, P., Converting output scores from outlier detection algorithms into probability estimates, ICDM’06, 212-221 (2006)
[20] Gomez-Rodriguez, M.; Balduzzi, D.; Schölkopf, B., Uncovering the temporal dynamics of diffusion networks, ICML (2011)
[21] Gomez-Rodriguez, M.; Schölkopf, B., Influence maximization in continuous time diffusion networks, ICML (2012)
[22] Gupta, M.; Gao, J.; Sun, Y.; Han, J., Community trend outlier detection using soft temporal pattern mining, ECML/PKDD (2)’12, 692-708 (2012)
[23] Ha, J.; Kim, S.; Kim, S.; Faloutsos, C.; Park, S., An analysis on information diffusion through blogcast in a blogosphere, Inf. Sci., 290, 45-62 (2015)
[24] He, X.; Song, G.; Chen, W.; Jiang, Q., Influence blocking maximization in social networks under the competitive linear threshold model, SDM, 463-474 (2012)
[25] Kempe, D.; Kleinberg, J.; Tardos, E., Maximizing the spread of influence through a social network, SIGKDD, 137-146 (2003)
[26] Kleinberg, J., Bursty and hierarchical structure in streams, SIGKDD, 91-101 (2002)
[27] Kong, X.; Wu, Z.; Li, L.; Zhang, R.; Yu, P. S.; Wu, H.; Fan, W., Large-scale multi-label learning with incomplete label assignments, SDM (2014)
[28] Kong, X.; Yu, P. S.; Wang, X.; Ragin, A. B., Discriminative feature selection for uncertain graph classification, SDM, 82-93 (2013)
[29] Kong, X.; Zhang, J.; Yu, P. S., Inferring anchor links across multiple heterogeneous social networks, CIKM, 179-188 (2013)
[30] Kriegel, H. P.; Kröger, P.; Zimek, A., Clustering high-dimensional data: a survey on subspace clustering, pattern-based clustering, and correlation clustering, ACM Trans. Knowl. Discov. Data, 3, 1, 1:1-1:58 (2009)
[31] Lee, K.; Caverlee, J.; Webb, S., Uncovering social spammers: social honeypots + machine learning, SIGIR, 435-442 (2010)
[32] Li, Y.; Qian, M.; Jin, D.; Hui, P.; Vasilakos, A. V., Revealing the efficiency of information diffusion in online social networks of microblog, Inf. Sci., 293, 383-389 (2015)
[33] De Melo, P. O. S. V.; Akoglu, L.; Faloutsos, C.; Loureiro, A. A. F., Surprising patterns for the call duration distribution of mobile phone users, ECML/PKDD (3)’10, 354-369 (2010)
[34] Mossel, E.; Roch, S., On the submodularity of influence in social networks, STOC, 128-134 (2007) · Zbl 1232.68183
[35] Mukherjee, A.; Kumar, A.; Liu, B.; Wang, J.; Hsu, M.; Castellanos, M.; Ghosh, R., Spotting opinion spammers using behavioral footprints, SIGKDD, 632-640 (2013)
[36] Pokrajac, D.; Lazarevic, A.; Latecki, L. J., Incremental local outlier detection for data streams, CIDM’07, 504-515 (2007)
[37] Prakash, B. A.; Tong, H.; Valler, N.; Faloutsos, M.; Faloutsos, C., Virus propagation on time-varying networks: theory and immunization algorithms, PKDD, 99-114 (2010)
[38] Qu, Q.; Chen, C.; Jensen, C. S.; Skovsgaard, A., Space-time aware behavioral topic modeling for microblog posts, IEEE Data Eng. Bull., 38, 2, 58-67 (2015)
[39] Qu, Q.; Liu, S.; Zhu, F.; Jensen, C. S., Efficient online summarization of large-scale dynamic networks, IEEE Trans. Knowl. Data Eng., 28, 12, 3231-3245 (2016)
[40] Richardson, M.; Domingos, P., Mining knowledge-sharing sites for viral marketing, SIGKDD, 61-70 (2002)
[41] Schubert, E.; Zimek, A.; Kriegel, H., Local outlier detection reconsidered: a generalized view on locality with applications to spatial, video, and network outlier detection, Data Min. Knowl. Discov., 190-237 (2014) · Zbl 1281.68192
[42] Seshadri, M.; Machiraju, S.; Sridharan, A.; Bolot, J.; Faloutsos, C.; Leskovec, J., Mobile call graphs: beyond power-law and lognormal distributions, SIGKDD, 596-604 (2008)
[43] Singh, L.; Sayal, M., Privately detecting bursts in streaming, distributed time series data, Data Knowl. Eng., 68, 6, 509-530 (2009)
[44] Tang, J.; Sun, J.; Wang, C.; Yang, Z., Social influence analysis in large-scale networks, SIGKDD, 807-816 (2009)
[45] Tang, Y.; Shi, Y.; Xiao, X., Influence maximization in near-linear time: a martingale approach, SIGMOD, 1539-1554 (2015)
[46] Wang, Y.; Cong, G.; Song, G.; Xie, K., Community-based greedy algorithm for mining top-k influential nodes in mobile social networks, SIGKDD, 1039-1048 (2010)
[47] Yamanishi, K.; Takeuchi, J., A unifying framework for detecting outliers and change points from non-stationary time series data, SIGKDD, 676-681 (2002)
[48] Zhang, J.; Kong, X.; Yu, P. S., Transferring heterogeneous links across location-based social networks, WSDM, 303-312 (2014)
[49] Zhang, X.; Shasha, D., Better burst detection, ICDE (2006)
[50] Zhang, Y.; Jiang, Q.; Zhang, L.; Zhu, Y., Exploiting bidirectional links: making spamming detection easier, CIKM, 1839-1842 (2009)
[51] Zhou, F.; Jiao, J.; Lei, B., A linear threshold-hurdle model for product adoption prediction incorporating social network effects, Inf. Sci., 307, 95-109 (2015)
[52] Zhu, L.; Zhao, H.; Wang, H., Complex dynamic behavior of a rumor propagation model with spatial-temporal diffusion terms, Inf. Sci., 349-350, 119-136 (2016) · Zbl 1398.91489
[53] Zhu, Y.; Wang, X.; Zhong, E.; Liu, N. N.; Li, H.; Yang, Q., Discovering spammers in social networks, AAAI (2012)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.