Multi-level anomaly detection: relevance of big data analytics in networks. (English) Zbl 1339.68018

Summary: The Internet has become a vital source of information, and internal and external attacks threaten the integrity of any LAN connected to it. In this work, several techniques are described for the detection of such threats. We have focussed on anomaly-based intrusion detection in the campus environment at the network edge. A campus LAN consisting of more than 9000 users with a 90 Mbps Internet access link is a large network. Therefore, efficient techniques are required to handle such big data and to model user behaviour. Proxy server logs of a campus LAN and edge router traces have been used to detect anomalies such as abusive Internet access and systematic downloading (internal threats) and DDoS attacks (external threats). Our techniques involve machine learning and time series analysis applied at different layers in the TCP/IP stack. The accuracy of our techniques has been demonstrated through extensive experimentation on huge and varied datasets. All the techniques are applicable at the edge and can be integrated into a Network Intrusion Detection System.
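The summary does not say which algorithms are used, so the following is only an illustration of what time series analysis of edge router traces can look like, not the authors' method. It is a minimal sketch that assumes the trace has already been reduced to a per-interval count of TCP SYN packets and flags intervals that exceed an exponentially weighted moving average (EWMA) baseline. The function name, the parameters `alpha`, `k` and `warmup`, the floor on the standard deviation and the sample data are all hypothetical.

```python
from typing import List


def ewma_anomalies(counts: List[float], alpha: float = 0.2,
                   k: float = 3.0, warmup: int = 5) -> List[int]:
    """Return indices whose count exceeds the EWMA baseline by more than
    k standard deviations (illustrative sketch, hypothetical parameters)."""
    flagged: List[int] = []
    if not counts:
        return flagged
    mean = float(counts[0])  # baseline starts at the first observation
    var = 0.0                # exponentially weighted variance estimate
    for i in range(1, len(counts)):
        dev = counts[i] - mean
        # Assumption: floor the standard deviation at 1.0 so that a very
        # quiet baseline does not turn tiny fluctuations into alarms.
        std = max(var ** 0.5, 1.0)
        if i >= warmup and dev > k * std:
            flagged.append(i)
            continue  # keep the anomalous sample out of the baseline
        mean += alpha * dev
        var = (1 - alpha) * (var + alpha * dev * dev)
    return flagged


# Hypothetical SYN counts per second with one burst at index 8.
syn_per_second = [100, 104, 98, 101, 99, 103, 97, 102, 900, 100, 101]
print(ewma_anomalies(syn_per_second))
```

Only upward deviations are flagged here because a flooding attack shows up as a surge; a detector for outages or stealthy low-rate attacks would need a different test.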


68M11 Internet topics
62H30 Classification and discrimination; cluster analysis (statistical aspects)
68T05 Learning and adaptive systems in artificial intelligence



