×

Revisiting multiple pattern matching algorithms for multi-core architecture. (English) Zbl 1280.68059

Summary: Due to the huge size of patterns to be searched, multiple pattern searching remains a challenge to several newly-arising applications like network intrusion detection. In this paper, we present an attempt to design efficient multiple pattern searching algorithms on multi-core architectures. We observe an important feature which indicates that the multiple pattern matching time mainly depends on the number and minimal length of patterns. The multi-core algorithm proposed in this paper leverages this feature to decompose pattern set so that the parallel execution time is minimized. We formulate the problem as an optimal decomposition and scheduling of a pattern set, then propose a heuristic algorithm, which takes advantage of dynamic programming and greedy algorithmic techniques, to solve the optimization problem. Experimental results suggest that our decomposition approach can increase the searching speed by more than 200% on a 4-core AMD Barcelona system.

MSC:

68M10 Network design and communication in computer systems
68W10 Parallel algorithms in computer science

Software:

Snort
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Snort: Open-source network ids/ips. http://www.snort.org , 2011.
[2] Villa O, Scarpazza D P, Petrini F. Accelerating real-time string searching with multicore processors. Computer, 2008, 41(4): 42–50. · Zbl 05331939 · doi:10.1109/MC.2008.105
[3] Bu L, Chandy J A. A cam-based keyword match processor architecture. Journal of Microelectronics, 2006, 37(8): 828–836. · doi:10.1016/j.mejo.2005.10.015
[4] Sourdis I, Pnevmatikatos D. Pre-decoded cams for efficient and high-speed nids pattern matching. In Proc. the 12th Ann. IEEE Symp. Field-Programmable Custom Computing Machines (FCCM2004), Napa, USA, Apr. 20–23, 2004, pp.258-267.
[5] Antonatos S, Anagnostakis K G, Markatos E P, Polychronakis M. Performance analysis of content matching intrusion detection systems. In Proc. 2004 Symp. Applications and the Internet (SAINT 2004), Tokyo, Japan, Jan. 26–30, 2004, pp.208-218.
[6] Chang C, Paige R. From regular expressions to DFA’s using compressed NFA’s. In Proc. the 3 rd Ann. Symp. Combinatorial Pattern Matching (CPM1992), Tucson, USA, Apr. 29-May 1, 1992, pp.88-108. · Zbl 0912.68105
[7] Sidhu R, Prasanna V K. Fast regular expression matching using FPGAs. In Proc. the 9th Ann. IEEE Symp. Field-Programmable Custom Computing Machines (FCCM2001), Rohnert, USA, Mar. 29-Apr. 2, 2001, pp.227-238.
[8] Hutchings B L, Franklin R, Carver D. Assisting network intrusion detection with recon\={}gurable hardware. In Proc. the 10th Ann. IEEE Symp. Field-Programmable Custom Computing Machines (FCCM2002), Napa, USA, Apr. 22–24, 2002, pp.111-120.
[9] Jung H J, Baker Z K, Prasanna V K. Performance of FPGA implementation of bit-split architecture for intrusion detection systems. In Proc. the 20th Int. Symp. Parallel and Distributed Processing Symp. (IPDPS 2006), Rhodes Island, Greece, Apr. 25–29, 2006, p.189.
[10] Kaushik R, Govindarajan R. Two-level mapping based cache index selection for packet forwarding engines. In Proc. the 15th International Conference on Parallel Architectures and Compilation Techniques (PACT2006), Seattle, USA, Sept. 16–20, 2006, pp.212-221.
[11] Scarpazza D P, Villa O, Petrini F. Peak-performance DFA-based string matching on the cell processor. In Proc. the 21st Int. Parallel and Distributed Processing Symp. (IPDPS 2007), Long Beach, USA, Mar. 26–30, 2007, pp.1-8.
[12] Gonzalo N, Mathieu R. Flexible Pattern Matching in Strings: Practical On-Line Search Algorithms for Texts and Biological Sequences. Cambridge University Press, New York, NY, USA, 2002. · Zbl 0992.92029
[13] Aho A V, Corasick M J. Efficient string matching: An aid to bibliographic search. Commun. ACM, 1975, 18(6): 333–340. · Zbl 0301.68048 · doi:10.1145/360825.360855
[14] Wu S, Manber U. A fast algorithm for multi-pattern searching. Technical Report TR-94-17, 1994. · Zbl 0807.68037
[15] Beate C W. A string matching algorithm fast on the average. In Proc. the 6th Colloquium on Automata, Languages and Programming, Graz, Austria, Jul. 14–18, 1979, pp.118-132. · Zbl 0407.68092
[16] Liu P, Liu Y, Tan J. A partition-based efficient algorithm for large scale multiple-strings matching. In Proc. the 12th Int. Conf. String Processing and Information Retrieval, Buenos Aires, Argentina, Nov. 2–4, 2005, pp.399-404.
[17] Coffman E G. Computer and Job-Shop Scheduling Theory. Wiley, 1976. · Zbl 0359.90031
[18] Lenstra J K, Kan A R. Complexity of scheduling under precedence constraints. Oper. Res., 1978, 26(1): 22–35. · Zbl 0371.90060 · doi:10.1287/opre.26.1.22
[19] Cormen T H, Leiserson C E, Rivest R L. Introduction to Algorithms. The MIT Press, 2002. · Zbl 1187.68679
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.