×

Dashed strings for string constraint solving. (English) Zbl 07382222

Summary: String processing is ubiquitous across computer science, and arguably more so in web programming – where it is also a critical part of security issues such as injection attacks. In recent years, a number of string solvers have been developed to solve combinatorial problems involving string variables and constraints. We examine the dashed string approach to string constraint solving, which represents an unknown string as a sequence of blocks of characters with bounds on their cardinalities. The solving approach relies on propagation of information about the blocks of characters that arise from reasoning about the constraints in which they occur. This approach shows promising performance on many benchmarks involving constraints like string length, equality, concatenation, and regular expression membership. In this paper, we formally review the definition, the properties and the use of dashed strings for string constraint solving, and we provide an empirical validation that confirms the effectiveness of this approach.

MSC:

68Txx Artificial intelligence
PDF BibTeX XML Cite
Full Text: DOI

References:

[1] Emmi, M.; Majumdar, R.; Sen, K., Dynamic test input generation for database applications, (Proceedings of the ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA) (2007), ACM), 151-162
[2] Bjørner, N.; Tillmann, N.; Voronkov, A., Path feasibility analysis for string-manipulating programs, (Proceedings of the International Conference on Tools and Algorithms for the Construction and Analysis of Systems. Proceedings of the International Conference on Tools and Algorithms for the Construction and Analysis of Systems, LNCS, vol. 5505 (2009), Springer), 307-321 · Zbl 1234.68070
[3] Gange, G.; Navas, J. A.; Stuckey, P. J.; Søndergaard, H.; Schachte, P., Unbounded model-checking with interpolation for regular language constraints, (Proceedings of the International Conference on Tools and Algorithms for the Construction and Analysis of Systems. Proceedings of the International Conference on Tools and Algorithms for the Construction and Analysis of Systems, LNCS, vol. 7795 (2013), Springer), 277-291 · Zbl 1381.68162
[4] Bisht, P.; Hinrichs, T. L.; Skrupsky, N.; Venkatakrishnan, V. N., WAPTEC: whitebox analysis of web applications for parameter tampering exploit construction, (Proceedings of ACM Conference on Computer and Communications Security (2011), ACM), 575-586
[5] Barahona, P.; Krippahl, L., Constraint programming in structural bioinformatics, Constraints, 13, 1-2, 3-20 (2008) · Zbl 1144.92011
[6] Hooimeijer, P.; Weimer, W., StrSolve: solving string constraints lazily, Autom. Softw. Eng., 19, 4, 531-559 (2012)
[7] Li, G.; Ghosh, I., PASS: string solving with parameterized array and interval automaton, (Bertacco, V.; Legay, A., Proc. 9th Int. Haifa Verification Conf.. Proc. 9th Int. Haifa Verification Conf., LNCS, vol. 8244 (2013), Springer), 15-31
[8] Tateishi, T.; Pistoia, M.; Tripp, O., Path- and index-sensitive string analysis based on monadic second-order logic, ACM Trans. Softw. Eng. Methodol., 22, 4, Article 33 pp. (2013)
[9] Berzish, M.; Ganesh, V.; Zheng, Y., Z3str3: a string solver with theory-aware heuristics, (Stewart, D.; Weissenbacher, G., Proc. 17th Conf. Formal Methods in Computer-Aided Design (2017), FMCAD Inc.), 55-59
[10] Liang, T.; Reynolds, A.; Tinelli, C.; Barrett, C.; Deters, M., A DPLL(T) theory solver for a theory of strings and regular expressions, (Biere, A.; Bloem, R., Computer Aided Verification: Proc. 26th Int. Conf.. Computer Aided Verification: Proc. 26th Int. Conf., LNCS, vol. 8559 (2014), Springer), 646-662
[11] Kieżun, A.; Ganesh, V.; Artzi, S.; Guo, P. J.; Hooimeijer, P.; Ernst, M. D., HAMPI: a solver for word equations over strings, regular expressions, and context-free grammars, ACM Trans. Softw. Eng. Methodol., 21, 4, Article 25 pp. (2012)
[12] Saxena, P.; Akhawe, D.; Hanna, S.; Mao, F.; McCamant, S.; Song, D., A symbolic execution framework for JavaScript, (Proc. 2010 IEEE Symp. Security and Privacy (2010), IEEE Comp. Soc.), 513-528
[13] Scott, J. D.; Flener, P.; Pearson, J.; Schulte, C., Design and implementation of bounded-length sequence variables, (Lombardi, M.; Salvagnin, D., Proc. 14th Int. Conf. Integration of Artificial Intelligence and Operations Research Techniques in Constraint Programming. Proc. 14th Int. Conf. Integration of Artificial Intelligence and Operations Research Techniques in Constraint Programming, LNCS, vol. 10335 (2017), Springer), 51-67 · Zbl 06756575
[14] Amadini, R.; Flener, P.; Pearson, J.; Scott, J. D.; Stuckey, P. J.; Tack, G., MiniZinc with strings, (Hermenegildo, M.; López-García, P., LOPSTR 2016: Revised Selected Papers. LOPSTR 2016: Revised Selected Papers, LNCS, vol. 10184 (2017), Springer), 59-75 · Zbl 06791219
[15] Amadini, R.; Gange, G.; Stuckey, P. J.; Tack, G., A novel approach to string constraint solving, (Beck, J. C., Proc. 23rd Int. Conf. Principles and Practice of Constraint Programming. Proc. 23rd Int. Conf. Principles and Practice of Constraint Programming, LNCS, vol. 10416 (2017), Springer), 3-20
[16] Amadini, R.; Gange, G.; Stuckey, P. J., Sweep-based propagation for string constraint solving, (Proc. 32nd AAAI Conf. Artificial Intelligence (2018), AAAI Press), 6557-6564
[17] Amadini, R.; Gange, G.; Stuckey, P. J., Propagating lex, find and replace with dashed strings, (van Hoeve, W.-J., Proc. 15th Int. Conf. Integration of Artificial Intelligence and Operations Research Techniques in Constraint Programming. Proc. 15th Int. Conf. Integration of Artificial Intelligence and Operations Research Techniques in Constraint Programming, LNCS, vol. 10848 (2018), Springer) · Zbl 06982381
[18] Amadini, R.; Gange, G.; Stuckey, P. J., Propagating regular membership with dashed strings, (Hooker, J., Proc. 24th Conf. Principles and Practice of Constraint Programming. Proc. 24th Conf. Principles and Practice of Constraint Programming, LNCS, vol. 11008 (2018), Springer), 13-29
[19] Amadini, Roberto, G-strings: gecode with string variables (2019), available at
[20] Blotsky, D.; Mora, F.; Berzish, M.; Zheng, Y.; Kabir, I.; Ganesh, V., Stringfuzz: a fuzzer for string solvers, (Computer Aided Verification - 30th International Conference, CAV 2018, Held as Part of the Federated Logic Conference, FloC 2018. Computer Aided Verification - 30th International Conference, CAV 2018, Held as Part of the Federated Logic Conference, FloC 2018, FloC 2018, Oxford, UK, July 14-17, 2018, Proceedings, Part II (2018)), 45-51
[21] Michel, C.; Rueher, M.; Lebbah, Y., Solving constraints over floating-point numbers, (Principles and Practice of Constraint Programming - CP 2001. Principles and Practice of Constraint Programming - CP 2001, 7th International Conference, CP 2001, Paphos, Cyprus, November 26 - December 1, 2001, Proceedings (2001)), 524-538 · Zbl 1067.68658
[22] Costantini, G.; Ferrara, P.; Cortesi, A., A suite of abstract domains for static analysis of string values, Softw. Pract. Exp., 45, 2, 245-287 (2015)
[23] Cousot, P.; Cousot, R., Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints, (Proceedings of the Fourth ACM Symposium on Principles of Programming Languages (1977), ACM), 238-252
[24] Aggoun, A.; Beldiceanu, N., Extending CHIP in order to solve complex scheduling and placement problems, Math. Comput. Model., 17, 7, 57-73 (1993)
[25] Amadini, R.; Gange, G.; Stuckey, P. J., Dashed strings and the replace(-all) constraint, (Principles and Practice of Constraint Programming - 26th International Conference. Principles and Practice of Constraint Programming - 26th International Conference, CP 2020, Louvain-la-Neuve, Belgium, September 7-11, 2020, Proceedings. Principles and Practice of Constraint Programming - 26th International Conference. Principles and Practice of Constraint Programming - 26th International Conference, CP 2020, Louvain-la-Neuve, Belgium, September 7-11, 2020, Proceedings, Lecture Notes in Computer Science (2020), Springer)
[26] Barták, R., Modelling resource transitions in constraint-based scheduling, (SOFSEM 2002: Theory and Practice of Informatics. SOFSEM 2002: Theory and Practice of Informatics, 29th Conference on Current Trends in Theory and Practice of Informatics, Milovy, Czech Republic, November 22-29, 2002, Proceedings (2002)), 186-194
[27] Pesant, G., A regular language membership constraint for finite sequences of variables, (Wallace, M., Proceedings of the 10th International Conference on Principles and Practice of Constraint Programming. Proceedings of the 10th International Conference on Principles and Practice of Constraint Programming, LNCS, vol. 3258 (2004), Springer-Verlag), 482-495 · Zbl 1152.68573
[28] Cheng, K.; Yap, R., Maintaining generalized arc consistency on ad hoc r-ary constraints, (14th International Conference on Principles and Process of Constraint Programming. 14th International Conference on Principles and Process of Constraint Programming, LNCS, vol. 5202 (2008)), 509-523
[29] Perez, G.; Régin, J., Improving GAC-4 for table and MDD constraints, (O’Sullivan, B., Principles and Practice of Constraint Programming - 20th International Conference. Principles and Practice of Constraint Programming - 20th International Conference, CP 2014, Lyon, France, September 8-12, 2014, Proceedings. Principles and Practice of Constraint Programming - 20th International Conference. Principles and Practice of Constraint Programming - 20th International Conference, CP 2014, Lyon, France, September 8-12, 2014, Proceedings, Lecture Notes in Computer Science, vol. 8656 (2014), Springer), 606-621
[30] Nethercote, N.; Stuckey, P. J.; Becket, R.; Brand, S.; Duck, G. J.; Tack, G., MiniZinc: towards a standard CP modelling language, (Proceedings of the 13th International Conference on Principles and Practice of Constraint Programming. Proceedings of the 13th International Conference on Principles and Practice of Constraint Programming, LNCS, vol. 4741 (2007), Springer), 529-543
[31] Gecode: generic constraint development environment (2016), available at
[32] Amadini, R.; Andrlon, M.; Gange, G.; Schachte, P.; Søndergaard, H.; Stuckey, P. J., Constraint programming for dynamic symbolic execution of javascript, (Proc. 16th Int. Conf. Integration of Artificial Intelligence and Operations Research Techniques in Constraint Programming. Proc. 16th Int. Conf. Integration of Artificial Intelligence and Operations Research Techniques in Constraint Programming, LNCS (2019), Springer) · Zbl 07116682
[33] Mak, Andrlon; Amadini, Roberto, Aratha DSE tool (2019), available at
[34] Godefroid, P.; Klarlund, N.; Sen, K., DART: directed automated random testing, (Proc. ACM SIGPLAN Conf. Programming Language Design and Implementation (PLDI’05) (2005), ACM), 213-223
[35] Majumdar, R.; Sen, K., Hybrid concolic testing, (Proc. 29th Int. Conf. Software Engineering (ICSE 2007) (2007), IEEE), 416-426
[36] King, J. C., Symbolic execution and program testing, Commun. ACM, 19, 7, 385-394 (1976) · Zbl 0329.68018
[37] Blotsky, D.; Mora, F.; Berzish, M.; Zheng, Y.; Kabir, I.; Ganesh, V., Stringfuzz: a fuzzer for string solvers, (Computer Aided Verification - 30th International Conference. Computer Aided Verification - 30th International Conference, CAV 2018, Held as Part of the Federated Logic Conference, FloC 2018, Oxford, UK, July 14-17, 2018, Proceedings, Part II (2018)), 45-51
[38] de Moura, L. M.; Bjørner, N., Z3: an efficient SMT solver, (Tools and Algorithms for the Construction and Analysis of Systems. Tools and Algorithms for the Construction and Analysis of Systems, 14th International Conference, TACAS 2008, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2008, Budapest, Hungary, March 29-April 6, 2008. Proceedings (2008)), 337-340
[39] Abdulla, P. A.; Atig, M. F.; Chen, Y.; Holík, L.; Rezine, A.; Rümmer, P.; Stenman Norn, J., An SMT solver for string constraints, (CAV. CAV, LNCS, vol. 9206 (2015), Springer), 462-469
[40] Trinh, M.; Chu, D.; Jaffar, J., Model counting for recursively-defined strings, (Computer Aided Verification - 29th International Conference. Computer Aided Verification - 29th International Conference, CAV 2017, Heidelberg, Germany, July 24-28, 2017, Proceedings, Part II (2017)), 399-418
[41] Abdulla, P. A.; Atig, M. F.; Chen, Y.; Diep, B. P.; Holík, L.; Rezine, A.; Rümmer, P., Flatten and conquer: a framework for efficient analysis of string constraints, (Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation. Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2017, Barcelona, Spain, June 18-23, 2017 (2017)), 602-617
[42] Holík, L.; Janku, P.; Lin, A. W.; Rümmer, P.; Vojnar, T., String constraints with concatenation and transducers solved efficiently, PACMPL, 2, POPL, 4:1-4:32 (2018)
[43] Stuckey, P. J.; Feydy, T.; Schutt, A.; Tack, G.; Fischer, J., The minizinc challenge 2008-2013, AI Mag., 2, 55-60 (2014)
[44] Amadini, R.; Jordan, A.; Gange, G.; Gauthier, F.; Schachte, P.; Søndergaard, H.; Stuckey, P. J.; Zhang, C., Combining string abstract domains for JavaScript analysis: an evaluation, (Legay, A.; Margaria, T., Proc. 23rd Int. Conf. Tools and Algorithms for the Construction and Analysis of Systems, Part I. Proc. 23rd Int. Conf. Tools and Algorithms for the Construction and Analysis of Systems, Part I, LNCS, vol. 10205 (2017), Springer), 41-57
[45] Amadini, R.; Gange, G.; Gauthier, F.; Jordan, A.; Schachte, P.; Søndergaard, H.; Stuckey, P. J.; Zhang, C., Reference abstract domains and applications to string analysis, Fundam. Inform., 158, 4, 297-326 (2018) · Zbl 1386.68031
[46] Loring, B.; Mitchell, D.; Kinder, J., ExpoSE: practical symbolic execution of standalone JavaScript, (Proc. 24th ACM SIGSOFT Int. SPIN Symp. Model Checking of Software (SPIN’17) (2017), ACM Press), 196-199
[47] Thomé, J.; Shar, L. K.; Bianculli, D.; Briand, L. C., Search-driven string constraint solving for vulnerability detection, (ICSE 2017. ICSE 2017, Buenos Aires, Argentina, May 20-28, 2017 (2017)), 198-208
[48] Yu, F.; Alkhalaf, M.; Bultan, T., Stranger: an automata-based string analysis tool for PHP, (TACAS. TACAS, LNCS, vol. 6015 (2010), Springer), 154-157
[49] Zheng, Y.; Ganesh, V.; Subramanian, S.; Tripp, O.; Dolby, J.; Zhang, X., Effective search-space pruning for solvers of string equations, regular expressions and length constraints, (Computer Aided Verification - 27th International Conference. Computer Aided Verification - 27th International Conference, CAV 2015, San Francisco, CA, USA, July 18-24, 2015, Proceedings, Part I (2015)), 235-254
[50] Ganzinger, H.; Hagen, G.; Nieuwenhuis, R.; Oliveras, A.; Tinelli, C., Dpll(t): fast decision procedures, (Alur, R.; Peled, D. A., Computer Aided Verification: Proc. 16th Int. Conf.. Computer Aided Verification: Proc. 16th Int. Conf., LNCS, vol. 3114 (2004), Springer), 175-188 · Zbl 1103.68616
[51] Zheng, Y.; Zhang, X.; Ganesh, V., Z3-str: a Z3-based string solver for web application analysis, (Proc. 9th Joint Meeting on Foundations of Software Engineering (2013), ACM), 114-124
[52] Zheng, Y.; Ganesh, V.; Subramanian, S.; Tripp, O.; Dolby, J.; Zhang, X., Effective search-space pruning for solvers of string equations, regular expressions and length constraints, (CAV. CAV, LNCS, vol. 9206 (2015), Springer), 235-254
[53] Trinh, M.-T.; Chu, D.-H.; Jaffar, J., S3: a symbolic string solver for vulnerability detection in web applications, (Proc. 2014 ACM SIGSAC Conf. Computer and Communications Security (2014), ACM), 1232-1243
[54] Trinh, M.-T.; Chu, D.-H.; Jaffar, J., Progressive reasoning over recursively-defined strings, (Computer Aided Verification. Computer Aided Verification, Lecture Notes in Computer Science, vol. 9779 (2016), Springer), 218-240 · Zbl 1411.68031
[55] Abdulla, P. A.; Atig, M. F.; Chen, Y.-F.; Holík, L.; Rezine, A.; Rümmer, P.; Stenman, J., Norn: an SMT solver for string constraints, (Kroening, D.; Păsăreanu, C., Computer Aided Verification: Proc. 27th Int. Conf. Part I. Computer Aided Verification: Proc. 27th Int. Conf. Part I, LNCS, vol. 9206 (2015), Springer), 462-469
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.