Sequential pattern mining for ICT risk assessment and management. (English) Zbl 1408.68129

Summary: ICT risk assessment and management relies on the analysis of data on the joint behavior of a target system and its attackers. The tools in the Haruspex suite model intelligent, goal-oriented attackers that reach their goals through sequences of attacks. The tools synthetically generate these sequences through a Monte Carlo method that runs multiple simulations of the attacker behavior. This paper presents a sequential pattern mining analysis of the attack sequence database to extract a high-level and succinct understanding of the attacker strategies against the system to assess. Such an understanding is expressed as a set of sequential patterns that cover, and possibly partition, the attack sequences. This set can be extracted in isolation, or in contrast with the behavior of other attackers. In the latter case, the patterns represent a signature of the behavior of an attacker. The dynamic tools of the suite use this signature to deploy dynamic counter-measures that reduce the security risk. We formally motivate the need for using the class of maximal sequential patterns in covering attack sequences, instead of frequent or closed sequential patterns. When contrasting the behavior of different attackers, we resort to distinguishing sequential patterns. We report an extensive experimentation on a system with 36 nodes, 6 attackers, and 600K attack sequences.


68T10 Pattern recognition, speech recognition
68P15 Database theory
68T05 Learning and adaptive systems in artificial intelligence


Full Text: DOI


[1] SP 800-30 Revision 1: Guide for Conducting Risk Assessments, (2012), National Institute of Standards & Technology
[2] Baiardi, F.; Telmon, C.; Sgandurra, D., Haruspex: simulation-driven risk analysis for complex systems, ISACA J., 3, 46-51, (2012)
[3] Lee, W.; Stolfo, S. J.; Mok, K. W., Adaptive intrusion detection: a data mining approach, Artif. Intell. Rev., 14, 6, 533-567, (2000) · Zbl 0984.68545
[4] Katipally, R.; Gasior, W.; Cui, X.; Yang, L., Multistage attack detection system for network administrators using data mining, (Proc. of the Cyber Security and Information Intelligence Research Workshop, CSIIRW 2010, vol. 51, (2010), ACM)
[5] Brahmi, H.; Yahia, S. B., Discovering multi-stage attacks using closed multi-dimensional sequential pattern mining, (Proc. of Int. Conf. on Database and Expert Systems Applications, DEXA 2013, Lect. Notes Comput. Sci., vol. 8056, (2013), Springer), 450-457
[6] Fournier-Viger, P.; Lin, J. C.-W.; Kiran, R. U.; Koh, Y. S.; Thomas, R., A survey of sequential pattern mining, Data Sci. Pattern Recognit., 1, 54-77, (2017)
[7] Mabroukeh, N. R.; Ezeife, C. I., A taxonomy of sequential pattern mining algorithms, ACM Comput. Surv., 43, 1, 3:1-3:41, (2010)
[8] Mooney, C.; Roddick, J. F., Sequential pattern mining - approaches and algorithms, ACM Comput. Surv., 45, 2, 19:1-19:39, (2013) · Zbl 1293.68246
[9] Hochbaum, D. S., Approximating covering and packing problems: set cover, vertex cover, independent set, and related problems, (Hochbaum, D. S., Approximation Algorithms for NP-Hard Problems, (1997), PWS Publishing Co.), 94-143
[10] Ji, X.; Bailey, J.; Dong, G., Mining minimal distinguishing subsequence patterns with gap constraints, Knowl. Inf. Syst., 11, 3, 259-286, (2007)
[11] Liang, Z.; Sekar, R., Fast and automated generation of attack signatures: a basis for building self-protecting servers, (ACM Conference on Computer and Communications Security, (2005), ACM), 213-222
[12] Baiardi, F.; Corò, F.; Tonelli, F.; Sgandurra, D., A scenario method to automatically assess ICT risk, (Euromicro Int. Conf. on Parallel, Distributed, and Network-Based Processing, PDP 2014, (2014), IEEE Computer Society), 544-551
[13] NIST, National Vulnerability Database
[14] MITRE, Common Weakness Enumeration
[15] Schiffman, M., Common Vulnerability Scoring System
[16] Baiardi, F.; Tonelli, F.; Bertolini, A., Cyvar: extending Var-At-Risk to ICT, (Proc. of the Int. Workshop on Risk Assessment and Risk-Driven Testing, RISK 2015, Lect. Notes Comput. Sci., vol. 9488, (2015), Springer), 49-62
[17] Baiardi, F.; Tonelli, F.; Di Biase, A. D.R., Assessing and managing risk by simulating attack chains, (PDP, (2016), IEEE Computer Society), 581-584
[18] Fournier-Viger, P.; Gomariz, A.; Gueniche, T.; Soltani, A.; Wu, C.; Tseng, V. S., SPMF: a Java open-source pattern mining library, J. Mach. Learn. Res., 15, 3389-3393, (2014) · Zbl 1310.68178
[19] Baiardi, F.; Lipilini, J.; Tonelli, F., Using s-rules to fire dynamic countermeasures, (2017 25th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP, (2017)), 371-375
[20] Srikant, R.; Agrawal, R., Mining sequential patterns: generalizations and performance improvements, (EDBT, Lecture Notes in Computer Science, vol. 1057, (1996), Springer), 3-17
[21] Fumarola, F.; Lanotte, P. F.; Ceci, M.; Malerba, D., CloFAST: closed sequential pattern mining using sparse and vertical id-lists, Knowl. Inf. Syst., 48, 2, 429-463, (2016)
[22] Agrawal, R.; Srikant, R., Mining sequential patterns, (ICDE, (1995), IEEE Computer Society), 3-14
[23] Chen, J.; Cook, T., Mining contiguous sequential patterns from web logs, (WWW, (2007), ACM), 1177-1178
[24] Norouzi, M.; Souri, A.; SamadZamini, M. S., A data mining classification approach for behavioral malware detection, J. Comput. Netw. Commun., 2016, 8069672:1-8069672:9, (2016)
[25] Wang, J.; Han, J.; Li, C., Frequent closed sequence mining without candidate maintenance, IEEE Trans. Knowl. Data Eng., 19, 8, 1042-1056, (2007)
[26] Grossi, V.; Romei, A.; Ruggieri, S., A case study in sequential pattern mining for it-operational risk, (ECML/PKDD (1), Lecture Notes in Computer Science, vol. 5211, (2008), Springer), 424-439
[27] Luo, C.; Chung, S. M., Parallel mining of maximal sequential patterns using multiple samples, J. Supercomput., 59, 2, 852-881, (2012)
[28] Qin, P.; Duan, L.; Zhang, T.; Wang, P., A parallel algorithm for mining density-aware distinguishing sequential patterns with Spark, (CBD, (2016), IEEE Computer Society), 144-149
[29] Petitjean, F.; Li, T.; Tatti, N.; Webb, G. I., Skopus: mining top-k sequential patterns under leverage, Data Min. Knowl. Discov., 30, 5, 1086-1111, (2016)
[30] Grossi, V.; Romei, A.; Turini, F., Survey on using constraints in data mining, Data Min. Knowl. Discov., 31, 2, 424-464, (2017)
[31] van Leeuwen, M.; Vreeken, J., Mining and using sets of patterns through compression, (Frequent Pattern Mining, (2014), Springer), 165-198 · Zbl 1298.68250
[32] Lam, H. T.; Mörchen, F.; Fradkin, D.; Calders, T., Mining compressing sequential patterns, Stat. Anal. Data Min., 7, 1, 34-52, (2014)
[33] Srinivas, P. G.; Reddy, P. K.; Trinath, A. V.; Sripada, B.; Kiran, R. U., Mining coverage patterns from transactional databases, J. Intell. Inf. Syst., 45, 3, 423-439, (2015)
[34] Zheng, Z.; Wei, W.; Liu, C.; Cao, W.; Cao, L.; Bhatia, M., An effective contrast sequential pattern mining approach to taxpayer behavior analysis, World Wide Web, 19, 4, 633-651, (2016)
[35] Zhou, C.; Cule, B.; Goethals, B., Pattern based sequence classification, IEEE Trans. Knowl. Data Eng., 28, 5, 1285-1298, (2016)
[36] Liang, X.; Xiao, Y., Game theory for network security, IEEE Commun. Surv. Tutor., 15, 1, 472-486, (2013)
[37] van der Aalst, W. M., Process Mining, (2011), Springer · Zbl 1216.68016
[38] van der Aalst, W. M.P.; de Medeiros, A. K.A., Process mining and security: detecting anomalous process executions and checking process conformance, Electron. Notes Theor. Comput. Sci., 121, 3-21, (2005) · Zbl 1272.68348
[39] D’Andreagiovanni, M.; Baiardi, F.; Lipilini, J.; Ruggieri, S.; Tonelli, F., Sequential pattern mining for ICT risk assessment and prevention, (SEFM Workshops, Lecture Notes in Computer Science, vol. 10729, (2017), Springer), 25-39
[40] Pinto, H.; Han, J.; Pei, J.; Wang, K.; Chen, Q.; Dayal, U., Multi-dimensional sequential pattern mining, (CIKM, (2001), ACM), 81-88
[41] Esposito, F.; Mauro, N. D.; Basile, T. M.A.; Ferilli, S., Multi-dimensional relational sequence mining, Fundam. Inform., 89, 1, 23-43, (2008) · Zbl 1155.68484
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.