×

zbMATH — the first resource for mathematics

Z3str2: an efficient solver for strings, regular expressions, and length constraints. (English) Zbl 1360.68773
Summary: In recent years, string solvers have become an essential component in many formal verification, security analysis, and bug-finding tools. Such solvers typically support a theory of string equations, the length function, and the regular-expression membership predicate. These enable considerable expressive power, which comes at the cost of slow solving time, and in some cases even non-termination. We present three techniques, designed for word-based SMT string solvers, to mitigate these problems: (1) detecting overlapping variables, which is essential to avoiding common cases of non-termination; (2) pruning of the search space via bi-directional integration between the string and integer theories, enabling new cross-domain heuristics; and (3) a binary search based heuristic, allowing the procedure to skip unnecessary string length queries and converge on consistent length assignments faster for large strings. We have implemented above techniques atop the Z3-str solver, resulting in a significantly more robust and efficient solver, dubbed Z3str2, for the quantifier-free theory of string equations, the regular-expression membership predicate, and linear arithmetic over the length function. We report on a series of experiments over four sets of challenging real-world benchmarks, where we compare Z3str2 with five different string solvers: S3, CVC4, Kaluza, PISA and Stranger. Each of these tools utilizes a different solving strategy and/or string representation (based e.g. on words, bit vectors or automata). The results point to the efficacy of our proposed techniques, which yield dramatic performance improvement. We also demonstrate performance improvements enabled by Z3str2 in the context of symbolic execution for string-manipulating programs. We observe that the techniques presented here are of broad applicability, and can be integrated into other string solvers to improve their performance.
MSC:
68T15 Theorem proving (deduction, resolution, etc.) (MSC2010)
68W32 Algorithms on strings
PDF BibTeX XML Cite
Full Text: DOI
References:
[1] Z3str2 String Constraint Solver. https://sites.google.com/site/z3strsolver/
[2] IBM Security AppScan Source. http://www-03.ibm.com/software/products/en/appscan-source · Zbl 1192.68372
[3] Kausler Suite. https://github.com/BoiseState/string-constraint-solvers
[4] LibStranger. https://github.com/vlab-cs-ucsb/LibStranger
[5] Personal communications with the Stranger team. 2015
[6] Abdulla PA, Atig MF, Chen Y-F, Holík L, Rezine A, Rümmer P, Stenman J (2014) String constraints for verification. In: Proceedings of the 26th international conference on computer aided verification, CAV’14, pp 150-166
[7] Barrett C, Fontaine P, Tinelli C (2016) The Satisfiability Modulo Theories Library (SMT-LIB). http://www.SMT-LIB.org
[8] Bjørner N, Tillmann N, Voronkov A (2009) Path feasibility analysis for string-manipulating programs. In: Proceedings of the 15th international conference on tools and algorithms for the construction and analysis of systems, TACAS ’09, pp 307-321 · Zbl 1234.68070
[9] Cristian C, Daniel D, Dawson E (2008) Klee: Unassisted and automatic generation of high-coverage tests for complex systems programs. In: Proceedings of the 8th USENIX conference on operating systems design and implementation, OSDI’08. USENIX Association, Berkeley, pp 209-224
[10] Chipounov V, Kuznetsov V, Candea G (2011) S2e: a platform for in-vivo multi-path analysis of software systems. In: Proceedings of the sixteenth international conference on architectural support for programming languages and operating systems. ASPLOS XVI. ACM, New York, pp 265-278
[11] Christensen AS, Møller A, Schwartzbach MI (2003) Precise analysis of string expressions. In: Proceedings of the 10th international conference on static analysis, SAS’03, pp 1-18 · Zbl 1067.68541
[12] De Moura L, Bjørner N (2008) Z3: an efficient smt solver. In: Proceedings of the theory and practice of software, 14th international conference on Tools and algorithms for the construction and analysis of systems, TACAS’08, pp 337-340
[13] Ganesh V, Dill DL (2007) A decision procedure for bit-vectors and arrays. In: Proceedings of the 19th international conference on computer aided verification, CAV’07, pp 519-531 · Zbl 1135.68472
[14] Ganesh V, Minnes M, Solar-Lezama A, Rinard M (2012) Word equations with length constraints: what’s decidable? In: HVC’12
[15] Ghosh I, Shafiei N, Li G, Chiang W-F (2013) JST: an automatic test generation tool for industrial java applications with strings. In: Proceedings of the 2013 international conference on software engineering, ICSE ’13, pp 992-1001
[16] Hooimeijer P, Weimer W (2010) Solving string constraints lazily. In: Proceedings of the IEEE/ACM international conference on automated software engineering, ASE ’10, pp 377-386
[17] Hopcroft JE, Motwani R, Ullman JD (2007) Introduction to automata theory, languages, and computation. Pearson/Addison Wesley, Reading · Zbl 0980.68066
[18] Jeon J, Qiu X, Fetter-Degges J, Foster JS, Solar-Lezama A (2016) Synthesizing framework models for symbolic execution. In: Proceedings of the 38th international conference on software engineering, ICSE ’16. ACM, New York, pp 156-167
[19] Jeż A (2013) Recompression: word equations and beyond. In: Developments in language theory, Lecture Notes in Computer Science, pp 12-26
[20] Karhumäki, J; Mignosi, F; Plandowski, W, The expressibility of languages and relations by word equations, J ACM, 47, 483-505, (2000) · Zbl 1094.68618
[21] Kausler S (2014) Evaluation of string constraint solvers using dynamic symbolic execution. Master’s Thesis, Boise State University
[22] Kausler S, Sherman E (2014) Evaluation of string constraint solvers in the context of symbolic execution. In: Proceedings of the 29th ACM/IEEE international conference on automated software engineering, ASE ’14. ACM, New York, pp 259-270
[23] Kiezun A, Ganesh V, Guo PJ, Hooimeijer P, Ernst MD (2009) Hampi: a solver for string constraints. In: Proceedings of the eighteenth international symposium on software testing and analysis, ISSTA ’09, pp 105-116
[24] Li G, Andreasen E, Ghosh I (2014) SymJS: automatic symbolic testing of javascript web applications. In: Proceedings of the 22Nd ACM SIGSOFT international symposium on foundations of software engineering, FSE 2014, pp 449-459
[25] Li G, Ghosh I (2013) PASS: string solving with parameterized array and interval automaton. In: 9th international Haifa verification conference, HVC ’13, pp 15-31
[26] Liang T, Reynolds A, Tinelli C, Barrett C, Deters M (2014) A dpll(t) theory solver for a theory of strings and regular expressions. In: Proceedings of the 26th international conference on computer aided verification, CAV’14, pp 646-662
[27] Makanin GS (1977) The problem of solvability of equations in a free semigroup. Math Sbornik 103:147-236 (1977). English transl. in Math USSR Sbornik 32
[28] Matiyasevich Y (2007) Word equations, Fibonacci numbers, and Hilbert’s tenth problem. In: Workshop on Fibonacci words
[29] Plandowski, W, Satisfiability of word equations with constants is in pspace, J ACM, 51, 483-496, (2004) · Zbl 1192.68372
[30] Plandowski W (2006) An efficient algorithm for solving word equations. In: Proceedings of the 38th annual ACM symposium on theory of computing, STOC ’06, pp 467-476 · Zbl 1301.68165
[31] Qu X, Robinson B (2011) A case study of concolic testing tools and their limitations. In: 2011 international symposium on empirical software engineering and measurement (ESEM). IEEE Computer Society, pp 117-126
[32] Redelinghuys G, Visser W, Geldenhuys J (2012) Symbolic execution of programs with strings. In: Proceedings of the South African Institute for computer scientists and information technologists conference, SAICSIT ’12, pp 139-148
[33] Saxena P, Akhawe D, Hanna S, Mao F, McCamant S, Song D (2010) A symbolic execution framework for javascript. In: Proceedings of the 2010 IEEE symposium on security and privacy, SP ’10, pp 513-528
[34] Schulz, K; Schulz, K (ed.), Makanin’s algorithm for word equations-two improvements and a generalization, No. 572, 85-150, (1992), Berlin
[35] Tateishi, T; Pistoia, M; Tripp, O, Path- and index-sensitive string analysis based on monadic second-order logic, ACM Trans Softw Eng Methodol, 22, 33:1-33:33, (2013)
[36] Trinh M-T, Chu D-H, Jaffar J (2014) S3: a symbolic string solver for vulnerability detection in web applications. In: Proceedings of the 2014 ACM SIGSAC conference on computer and communications security, CCS ’14, pp 1232-1243 · Zbl 1094.68618
[37] Xie X, Liu Y, Le W, Li X, Chen H (2015) S-looper: automatic summarization for multipath string loops. In: Proceedings of the 2015 international symposium on software testing and analysis, ISSTA 2015. ACM, New York, pp 188-198
[38] Yu F, Alkhalaf M, Bultan T (2010) Stranger: an automata-based string analysis tool for php. In: Proceedings of the 16th international conference on tools and algorithms for the construction and analysis of systems, TACAS’10, pp 154-157
[39] Yu F, Bultan T, Ibarra OH (2009) Symbolic string verification: combining string analysis and size analysis. In: Proceedings of the 15th international conference on tools and algorithms for the construction and analysis of systems, TACAS ’09, pp 322-336 · Zbl 1234.68269
[40] Zheng Y, Ganesh V, Subramanian S, Tripp O, Dolby J, Zhang X (2015) Effective search-space pruning for solvers of string equations, regular expressions and length constraints. Technical report. https://sites.google.com/site/z3strsolver/publications
[41] Zheng Y, Zhang X, Ganesh V (2013) Z3-str: a z3-based string solver for web application analysis. In: Proceedings of the 2013 9th joint meeting on foundations of software engineering, ESEC/FSE 2013, pp 114-124
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.