BotTokenizer: exploring network tokens of HTTP-based botnet using malicious network traces. (English) Zbl 1439.94055

Chen, Xiaofeng (ed.) et al., Information security and cryptology. 13th international conference, Inscrypt 2017, Xi’an, China, November 3–5, 2017. Revised selected papers. Cham: Springer. Lect. Notes Comput. Sci. 10726, 383-403 (2018).
Summary: Malicious software, and botnets in particular, use the HTTP protocol as a command and control (C&C) channel to connect to attackers and control compromised clients. Because HTTP is widely used and passes easily through firewalls, malicious traffic can blend with legitimate traffic and remain undetected. Network signature-based detection systems and models offer high detection efficiency and accuracy, but their scalability and automation still need to be improved.
In this work, we present BotTokenizer, a novel network signature-based detection system that aims to detect malicious HTTP C&C traffic. BotTokenizer automatically learns recognizable network tokens from known HTTP C&C communications of different botnet families by using word segmentation techniques. In essence, BotTokenizer implements a coarse-grained network signature generation prototype that relies only on the Uniform Resource Locators (URLs) in HTTP requests. Our evaluation results demonstrate that BotTokenizer performs well at identifying HTTP-based botnets, with an acceptable classification error.
For the entire collection see [Zbl 1387.94003].
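The idea of learning tokens from C&C URLs can be illustrated with a minimal sketch. It is not the authors' method: the summary does not specify the word segmentation technique, so plain splitting on non-alphanumeric characters is swapped in here. The function names, the `min_share` and `min_hits` thresholds, and the sample URLs are all hypothetical.

```python
import re
from collections import Counter
from urllib.parse import urlsplit


def url_tokens(url):
    """Split the path and query of a URL into lowercase tokens.

    Assumption: delimiter-based splitting stands in for the word
    segmentation step, which the summary does not describe.
    """
    parts = urlsplit(url)
    text = parts.path + "?" + parts.query
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", text) if t]


def frequent_tokens(urls, min_share=0.5):
    """Return the tokens that occur in at least min_share of the URLs."""
    counts = Counter()
    for url in urls:
        # Count each token once per URL, not once per occurrence.
        counts.update(set(url_tokens(url)))
    threshold = min_share * len(urls)
    return {tok for tok, n in counts.items() if n >= threshold}


def matches(url, signature, min_hits=2):
    """Flag a URL that shares at least min_hits tokens with the signature."""
    return len(signature & set(url_tokens(url))) >= min_hits


# Hypothetical URLs standing in for traces of one botnet family.
sample_urls = [
    "http://example.test/gate.php?id=17&cmd=ping",
    "http://example.test/gate.php?id=42&cmd=report",
    "http://example.test/panel/gate.php?id=7",
]
signature = frequent_tokens(sample_urls, min_share=1.0)
```

With these sample URLs, `signature` holds the tokens shared by all three requests, and `matches(url, signature)` gives a coarse yes/no decision for a new URL. A real system would need per-family signatures and a way to limit false positives from tokens that are also common in benign traffic.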


94A60 Cryptography

