Defining behaviorizeable relations to enable inference in semi-automatic program synthesis. (English) Zbl 1528.68081

Summary: Developing and using automatic program synthesis mechanisms within Domain Specific Languages (DSLs) requires transcribing empirical knowledge and specifications to formal models. Since the related expertise is time-consuming to build, software engineers are discouraged from adopting automated synthesis as a development paradigm. New approaches promise a significant reduction of human training effort by synthesizing programs from lax specifications, such as unstructured or semi-structured natural language. However, they do not introduce semantics of logical inference that would help users interact with the synthesis mechanism, for example to treat misinterpreted specifications. To cover this shortcoming, in this work we introduce the concept of behaviorizeable relations, which can be less rigorous than behavioral equivalence – and therefore allow laxer specifications – but still perform inference under implicit logical interpretations of the synthesis domain. Using these relations, we develop a generalized greedy framework that iteratively replaces parts of specifications with equivalent implementations; any behaviorizeable relation can be plugged in this framework to synthesize programs while respecting inference under a corresponding logical system. We finally perform a case study, where we define an example predicate-based behaviorizeable relation that lets our framework linearly combine blocks of Python code based on their comment similarity to natural language specifications. We use the resulting mechanism to demonstrate that implicit inference pertains to understanding how to adjust or state new specifications when lax behaviorizeable relations are employed.


68N30 Mathematical aspects of software engineering (specification, verification, metrics, requirements, etc.)
03B70 Logic in computer science
68Q60 Specification and verification (program logics, model checking, etc.)
Full Text: DOI


[1] Allamanis, M.; Barr, E. T.; Devanbu, P.; Sutton, C., A survey of machine learning for big code and naturalness, ACM Comput. Surv., 51, 4, 1-37 (2018)
[2] Schmidt, D. C., Model-driven engineering, IEEE Comput. Soc., 39, 2, 25 (2006)
[3] Gulwani, S., Dimensions in program synthesis, (Proceedings of the 12th International ACM SIGPLAN Symposium on Principles and Practice of Declarative Programming (2010), ACM), 13-24
[4] Voigtländer, J., Ideas for connecting inductive program synthesis and bidirectionalization, (Proceedings of the ACM SIGPLAN 2012 Workshop on Partial Evaluation and Program Manipulation (2012), ACM), 39-42
[5] Dillig, I.; Dillig, T.; Li, B.; McMillan, K.; Sagiv, M., Synthesis of circular compositional program proofs via abduction, Int. J. Softw. Tools Technol. Transf., 19, 5, 535-547 (2017)
[6] Osera, P.-M.; Zdancewic, S., Type-and-example-directed program synthesis, (ACM SIGPLAN Notices, vol. 50 (2015), ACM), 619-630
[7] Wang, C.; Cheung, A.; Bodik, R., Synthesizing highly expressive SQL queries from input-output examples, (Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (2017)), 452-466
[8] Alur, R.; Radhakrishna, A.; Udupa, A., Scaling enumerative program synthesis via divide and conquer, (International Conference on Tools and Algorithms for the Construction and Analysis of Systems (2017), Springer), 319-336 · Zbl 1452.68043
[9] Whittle, J.; Hutchinson, J.; Rouncefield, M.; Burden, H.; Heldal, R., A taxonomy of tool-related issues affecting the adoption of model-driven engineering, Softw. Syst. Model., 16, 2, 313-331 (2017)
[10] Akdur, D.; Garousi, V.; Demirörs, O., A survey on modeling and model-driven engineering practices in the embedded software industry, J. Syst. Archit., 91, 62-82 (2018)
[11] Shin, E. C.; Allamanis, M.; Brockschmidt, M.; Polozov, A., Program synthesis and semantic parsing with learned code idioms, (Advances in Neural Information Processing Systems (2019)), 10824-10834
[12] Desai, A.; Gulwani, S.; Hingorani, V.; Jain, N.; Karkare, A.; Marron, M.; Roy, S., Program synthesis using natural language, (Proceedings of the 38th International Conference on Software Engineering (2016), ACM), 345-356
[13] Allamanis, M.; Brockschmidt, M., Smartpaste: learning to adapt source code (2017), arXiv preprint
[14] Yaghmazadeh, N.; Wang, Y.; Dillig, I.; Dillig, T., Sqlizer: query synthesis from natural language, (Proceedings of the ACM on Programming Languages. Proceedings of the ACM on Programming Languages, OOPSLA, vol. 1 (2017)), 1-26
[15] Lara, J. D.; Guerra, E., A posteriori typing for model-driven engineering: concepts, analysis, and applications, ACM Trans. Softw. Eng. Methodol., 25, 4, 31 (2017)
[16] Guo, Z.; James, M.; Justo, D.; Zhou, J.; Wang, Z.; Jhala, R.; Polikarpova, N., Program synthesis by type-guided abstraction refinement, (Proceedings of the ACM on Programming Languages, POPL, vol. 4 (2019)), 1-28
[17] Amnell, T., Code synthesis for timed automata (2003), Uppsala Universitet, PhD thesis
[18] Mischkalla, F.; He, D.; Mueller, W., Closing the gap between uml-based modeling, simulation and synthesis of combined hw/sw systems, (Design, Automation & Test in Europe Conference & Exhibition. Design, Automation & Test in Europe Conference & Exhibition, DATE, 2010 (2010), IEEE), 1201-1206
[19] Ringert, J. O.; Roth, A.; Rumpe, B.; Wortmann, A., Code generator composition for model-driven engineering of robotics component & connector systems (2015), arXiv preprint
[20] Singh, R.; Gulwani, S., Predicting a correct program in programming by example, (International Conference on Computer Aided Verification (2015), Springer), 398-414
[21] Devlin, J.; Uesato, J.; Bhupatiraju, S.; Singh, R.; Mohamed, A.-r.; Kohli, P., Robustfill: neural program learning under noisy I/o (2017), arXiv preprint
[22] Alur, R.; Bodik, R.; Juniwal, G.; Martin, M. M.; Raghothaman, M.; Seshia, S. A.; Singh, R.; Solar-Lezama, A.; Torlak, E.; Udupa, A., Syntax-guided synthesis, (Formal Methods in Computer-Aided Design (FMCAD), 2013 (2013), IEEE), 1-8
[23] Gulwani, S.; Jha, S.; Tiwari, A.; Venkatesan, R., Synthesis of loop-free programs, ACM SIGPLAN Not., 46, 6, 62-73 (2011)
[24] Caulfield, B.; Rabe, M. N.; Seshia, S. A.; Tripakis, S., What’s decidable about syntax-guided synthesis? (2015), arXiv preprint
[25] Loncaric, C.; Torlak, E.; Ernst, M. D., Fast synthesis of fast collections, (Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation (2016)), 355-368
[26] Fedyukovich, G.; Bodík, R., Accelerating syntax-guided invariant synthesis, (International Conference on Tools and Algorithms for the Construction and Analysis of Systems (2018), Springer), 251-269
[27] Reynolds, A.; Kuncak, V.; Tinelli, C.; Barrett, C.; Deters, M., Refutation-based synthesis in smt, Form. Methods Syst. Des., 55, 2, 73-102 (2019) · Zbl 1427.68051
[28] Reichel, H., Behavioural validity of conditional equations in abstract data types, Contrib. Gen. Algebra, 3, 21-24 (1984)
[29] Jha, S.; Gulwani, S.; Seshia, S. A.; Tiwari, A., Oracle-guided component-based program synthesis, (2010 ACM/IEEE 32nd International Conference on Software Engineering, vol. 1 (2010), IEEE), 215-224
[30] Srivastava, S.; Gulwani, S.; Foster, J. S., From Program Verification to Program Synthesis, ACM Sigplan Notices, vol. 45, 313-326 (2010), ACM · Zbl 1312.68068
[31] Beohar, H.; Mousavi, M. R., Input-output conformance testing based on featured transition systems, (Proceedings of the 29th Annual ACM Symposium on Applied Computing (2014), ACM), 1272-1278
[32] Langlois, B.; Jitia, C.-E.; Jouenne, E., DSL classification, (OOPSLA 7th Workshop on Domain Specific Modeling (2007))
[33] Tisi, M.; Cheng, Z., Coqtl: an internal DSL for model transformation in coq, (International Conference on Theory and Practice of Model Transformations (2018), Springer), 142-156
[34] Saunders, R. R.; Green, G., The nonequivalence of behavioral and mathematical equivalence, J. Exp. Anal. Behav., 57, 2, 227-241 (1992)
[35] Manna, Z.; Waldinger, R., Fundamentals of deductive program synthesis, IEEE Trans. Softw. Eng., 18, 8, 674-704 (1992)
[36] Giorgi, R., The State of the Art of Automatic Programming (2018)
[37] Gross, T. A.; Nartker, T. A., A classification of automatic program synthesis systems (1988), Nevada Univ Las Vegas Dept of Computer Science and Electrical Engineering, Tech. rep.
[38] Thummalapenta, S.; Sinha, S.; Singhania, N.; Chandra, S., Automating test automation, (2012 34th International Conference on Software Engineering. 2012 34th International Conference on Software Engineering, ICSE (2012), IEEE), 881-891
[39] Joulin, A.; Mikolov, T., Inferring algorithmic patterns with stack-augmented recurrent nets, (Advances in Neural Information Processing Systems (2015)), 190-198
[40] Zettlemoyer, L. S.; Collins, M., Learning context-dependent mappings from sentences to logical form, (Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, vol. 2 (2009), Association for Computational Linguistics), 976-984
[41] Liu, H.; Lieberman, H., Metafor: visualizing stories as code, (Proceedings of the 10th International Conference on Intelligent User Interfaces (2005), ACM), 305-307
[42] Li, F.; Jagadish, H., Constructing an interactive natural language interface for relational databases, Proc. VLDB Endow., 8, 1, 73-84 (2014)
[43] Dong, L.; Lapata, M., Language to logical form with neural attention (2016), arXiv preprint
[44] Lin, X. V.; Wang, C.; Pang, D.; Vu, K.; Ernst, M. D., Program synthesis from natural language using recurrent neural networks (2017), University of Washington Department of Computer Science and Engineering: University of Washington Department of Computer Science and Engineering Seattle, WA, USA, Tech. Rep. UW-CSE-17-03-01
[45] Bunel, R.; Hausknecht, M.; Devlin, J.; Singh, R.; Kohli, P., Leveraging grammar and reinforcement learning for neural program synthesis (2018), arXiv preprint
[46] Neelakantan, A.; Le, Q. V.; Sutskever, I., Neural programmer: inducing latent programs with gradient descent (2015), arXiv preprint
[47] Reed, S.; De Freitas, N., Neural programmer-interpreters (2015), arXiv preprint
[48] Alon, U.; Sadaka, R.; Levy, O.; Yahav, E., Structural language models for any-code generation (2019), arXiv preprint
[49] Dalal, D.; Galbraith, B. V., Evaluating sequence-to-sequence learning models for if-then program synthesis (2020), arXiv preprint
[50] Feng, Y.; Martins, R.; Van Geffen, J.; Dillig, I.; Chaudhuri, S., Component-based synthesis of table consolidation and transformation tasks from examples, ACM SIGPLAN Not., 52, 6, 422-436 (2017)
[51] Feng, Y.; Martins, R.; Bastani, O.; Dillig, I., Program synthesis using conflict-driven learning, ACM SIGPLAN Not., 53, 4, 420-435 (2018)
[52] Wang, X.; Dillig, I.; Singh, R., Program synthesis using abstraction refinement, (Proceedings of the ACM on Programming Languages, POPL, vol. 2 (2017)), 1-30
[53] Chen, Y.; Wang, C.; Bastani, O.; Dillig, I.; Feng, Y., Program synthesis using deduction-guided reinforcement learning, (International Conference on Computer Aided Verification (2020), Springer), 587-610 · Zbl 1478.68054
[54] Qi, Z.; Long, F.; Achour, S.; Rinard, M., An analysis of patch plausibility and correctness for generate-and-validate patch generation systems, (Proceedings of the 2015 International Symposium on Software Testing and Analysis (2015)), 24-36
[55] Long, F.; Rinard, M., Automatic patch generation by learning correct code, (Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (2016)), 298-312
[56] Bueno-Soler, J.; Carnielli, W. A., Possible-translations algebraization for paraconsistent logics, Bull. Sect. Log., 34, 2, 77-92 (2005) · Zbl 1117.03035
[57] Luo, L.; Ming, J.; Wu, D.; Liu, P.; Zhu, S., Semantics-based obfuscation-resilient binary code similarity comparison with applications to software plagiarism detection, (Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering (2014), ACM), 389-400
[58] Jacobs, S.; Kuncak, V., Towards complete reasoning about axiomatic specifications, (International Workshop on Verification, Model Checking, and Abstract Interpretation (2011), Springer), 278-293 · Zbl 1317.68117
[59] Ji, R.; Liang, J.; Xiong, Y.; Zhang, L.; Hu, Z., Question selection for interactive program synthesis, (PLDI (2020)), 1143-1158
[60] Awad, M.; Khanna, R., Support vector regression, (Efficient Learning Machines (2015), Springer), 67-80
[61] Huang, J.; Ling, C. X., Using AUC and accuracy in evaluating learning algorithms, IEEE Trans. Knowl. Data Eng., 17, 3, 299-310 (2005)
[62] Hosmer, D. W.; Lemeshow, S.; Sturdivant, R. X., Applied Logistic Regression, vol. 398 (2013), John Wiley & Sons · Zbl 1276.62050
[63] Judson, T., Abstract Algebra: Theory and Applications (2014), Stephen F. Austin State University
[64] Väänänen, J., Second-order logic and foundations of mathematics, Bull. Symb. Log., 7, 04, 504-520 (2001) · Zbl 1002.03013
[65] Bottou, L., Large-scale machine learning with stochastic gradient descent, (Proceedings of COMPSTAT’2010 (2010), Springer), 177-186 · Zbl 1436.68293
[66] Domingos, P., The role of Occam’s razor in knowledge discovery, Data Min. Knowl. Discov., 3, 4, 409-425 (1999)
[67] Bird, S.; Loper, E.; Klein, E., Natural Language Processing with Python (2009), O’reilly Media Inc. · Zbl 1187.68630
[68] Bahmani, B.; Chowdhury, A.; Goel, A., Fast incremental and personalized pagerank (2010), arXiv preprint
[69] Tong, H.; Faloutsos, C.; Pan, J.-Y., Fast random walk with restart and its applications, (Sixth International Conference on Data Mining. Sixth International Conference on Data Mining, ICDM’06 (2006), IEEE), 613-622
[70] Cormen, T. H.; Leiserson, C. E.; Rivest, R. L.; Stein, C., Introduction to Algorithms (2009), MIT Press · Zbl 1187.68679
[71] Rouveirol, C., Flattening and saturation: two representation changes for generalization, Mach. Learn., 14, 2, 219-232 (1994) · Zbl 0804.68024
[72] Figueira, S.; Gorín, D.; Grimson, R., On the formal semantics of if-like logics, J. Comput. Syst. Sci., 76, 5, 333-346 (2010) · Zbl 1197.03028
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.