zbMATH — the first resource for mathematics

Using formal verification to evaluate the execution time of Spark applications. (English) Zbl 1451.68166
Summary: Apache Spark is probably the most widely adopted framework for developing big-data batch applications and for executing them on a cluster of (virtual) machines. In general, the more resources (machines) one uses, the faster applications execute, but there is currently no adequate means to determine the proper size of a Spark cluster given time constraints, or to foresee execution times given the number of employed machines. One can only run these applications and use her/his experience to size the cluster and predict expected execution times. Wrong estimation of execution times can lead to costly overruns and overly long executions, thus calling for analytic sizing/prediction techniques that provide precise time guarantees. This paper addresses this problem by proposing a solution based on model-checking. The approach exploits a directed acyclic graph (DAG) to abstract the structure of the execution flows of Spark programs, annotates each node (Spark stage) with execution-related data, and formulates the identification of the global execution time as a reachability problem. To avoid the well-known state space explosion problem, the paper also proposes a technique to reduce the size of generated abstract models. This results in a significant decrease in used memory and/or verification time making our approach feasible for predicting the execution time of Spark applications given the resources available. The benefits of the proposed reduction technique are evaluated by using both timed automata and constraint LTL over clocks logic to formally encode and analyze generated models. The approach is also successfully validated on some realistic case studies. Since the optimization is not Spark-specific, we claim that it can be applied to a wide range of applications whose underlying model can be abstracted as a DAG.
68Q60 Specification and verification (program logics, model checking, etc.)
GitHub; Kronos; NuSMV; SPIN; Uppaal
Full Text: DOI
[1] Alur, R.; Courcoubetis, C.; Dill, D., Model-checking in dense real-time, Inf Comput, 104, 1, 2-34 (1993) · Zbl 0783.68076
[2] Alur, R.; Dill, D., A theory of timed automata, Theor Comput Sci, 126, 2, 183-235 (1994) · Zbl 0803.68071
[3] Brito A, Ardagna D, Blanquer I, Evangelinou A, Barbierato E, Gribaudo M, Almeida J, Couto AP, Braga T (2017) D3.4 EUBra-BIGSEA QoS infrastructure services intermediate version. Technical report, EUBra-BIGSEA consortium
[4] Biere, A.; Cimatti, A.; Clarke, EC; Strichman, O.; Zhu, Y., Bounded model checking. Adv Comput, 58, 118-149 (2003)
[5] Behrmann G, David A, Larsen KG, Hakansson J, Petterson P, Yi W, Hendriks M (2006) Uppaal 4.0. In: Proceedings of the 3rd international conference on the quantitative evaluation of systems, QEST ’06, Washington, DC, USA. IEEE Computer Society, pp 125-126
[6] Bozga, M.; Daws, C.; Maler, O.; Olivero, A.; Tripakis, S.; Yovine, S.; Ravn, AP; Rischel, H., Kronos: a model-checking tool for real-time systems, Formal techniques in real-time and fault-tolerant systems, 298-302 (1998), Berlin: Springer, Berlin
[7] Baresi, L.; Ghezzi, C.; Mottola, L., Loupe: verifying publish-subscribe architectures with a magnifying lens, IEEE Trans Softw Eng, 37, 2, 228-246 (2011)
[8] Bradley S, Henderson W, Kendall D (1999) Using timed automata for response time analysis of distributed real-time systems. In: 24th IFAC/IFIP workshop on real-time programming, pp 143-148
[9] Behrmann, G.; Larsen, KG; Rasmussen, JI, Optimal scheduling using priced timed automata, SIGMETRICS Perform Eval Rev, 32, 4, 34-40 (2005)
[10] Bouyer P (2009) Model-checking timed temporal logics. Electron Notes Theor Comput Sci 231:323-341. Proceedings of the 5th workshop on methods for modalities (M4M5 2007) · Zbl 1347.68218
[11] Brin S, Page L (1998) The anatomy of a large-scale hypertextual web search engine. In: Proceedings of the international world-wide web conference (WWW), pp 107-117
[12] Baresi L, Pourhashem Kallehbasti MM, Rossi M (2016) How bit-vector logic can help improve the verification of LTL specifications over infinite domains. In: Proceedings of the 31st annual ACM symposium on applied computing, pp 1666-1673
[13] Baresi L, Quattrocchi G (2018) Towards vertically scalable spark applications. In: Euro-Par 2018: parallel processing workshops. Springer
[14] Bersani MM, Rossi M, San Pietro P (2017) A logical characterization of timed regular languages. Theor Comput Sci 658:46-59 · Zbl 1355.68150
[15] Bersani, MM; Rossi, M.; San Pietro, P., A tool for deciding the satisfiability of continuous-time metric temporal logic, Acta Informatica, 53, 2, 171-206 (2016) · Zbl 1336.68230
[16] Burges, CJC, A tutorial on support vector machines for pattern recognition, Data Min Knowl Discov, 2, 2, 121-167 (1998)
[17] Cimatti A, Clarke EM, Giunchiglia E, Giunchiglia F, Pistore M, Roveri M, Sebastiani R, Tacchella A (2002) Nusmv 2: an opensource tool for symbolic model checking. In: Computer aided verification, 14th international conference, CAV 2002, Copenhagen, Denmark, 27-31 July 2002, Proceedings, pp 359-364 · Zbl 1010.68766
[18] Clarke EC, Emerson EA, Jha S, Prasad Sistla A (1998) Symmetry reductions in model checking. In: Hu AJ, Vardi MY (eds) Computer aided verification, 10th international conference, CAV ’98, Vancouver, BC, Canada, 28 June-2 July 1998, Proceedings. Lecture notes in computer science, vol 1427. Springer, pp 147-158
[19] Clarke, EM; Grumberg, O.; Peled, DA, Model checking (1999), Cambridge: MIT Press, Cambridge
[20] Corbett JC (Jul 1996) Timing analysis of ada tasking programs. IEEE Trans Softw Eng 22(7):461-483
[21] DAG-ver Project repository. github.com/deib-polimi/DAG-ver, 2019
[22] Demri, S.; D’Souza, D., An automata-theoretic approach to constraint LTL, Inf Comput, 205, 3, 380-415 (2007) · Zbl 1113.03015
[23] Dilworth, RP, A decomposition theorem for partially ordered sets, Ann Math, 51, 1, 161-166 (1950) · Zbl 0038.02003
[24] Donaldson AF, Miller A, Parker D (2009) Language-level symmetry reduction for probabilistic model checking. In: QEST 2009, sixth international conference on the quantitative evaluation of systems, Budapest, Hungary, 13-16 Sept 2009. IEEE Computer Society, pp 289-298
[25] Finkel, A.; Schnoebelen, P., Well-structured transition systems everywhere!, Theor Comput Sci, 256, 1, 63-92 (2001) · Zbl 0973.68170
[26] Godefroid P (1996) Partial-order methods for the verification of concurrent systems: an approach to the state-explosion problem. Lecture notes in computer science, vol 1032. Springer, Berlin · Zbl 1293.68005
[27] Gianniti, E.; Rizzi, AM; Barbierato, E.; Gribaudo, M.; Ardagna, D., Fluid petri nets for the performance evaluation of mapreduce and Spark applications, SIGMETRICS Perform Eval Rev, 44, 23-36 (2017)
[28] Hazewinkel M (1987) Encyclopaedia of mathematics (1). Encyclopaedia of mathematics: an updated and annotated translation of the soviet “Mathematical Encyclopaedia”. Springer, Berlin · Zbl 0806.00009
[29] Henzinger, TA, The theory of hybrid automata, 265-292 (2000), Berlin: Springer, Berlin · Zbl 0959.68073
[30] Holzmann, GJ, The model checker SPIN, IEEE Trans Softw Eng, 23, 5, 279-295 (1997)
[31] Ikiz S, Garg VK (2004) Online algorithms for Dilworth’s chain partition. Technical report, Parallel and Distributed Systems Laboratory, Department of Electrical and Computer Engineering, University of Texas at Austin
[32] Jang K, Sherry J, Ballani H, Moncaster T (2015) Silo: predictable message latency in the cloud. In: Proceedings of the 2015 ACM conference on special interest group on data communication, SIGCOMM 2015, London, UK, 17-21 Aug 2015, pp 435-448
[33] Kc K, Anyanwu K (2010) Scheduling hadoop jobs to meet deadlines. In: Proceedings of the IEEE 2nd international conference on cloud computing technology and science. IEEE
[34] Krakora, J.; Waszniowski, L.; Pisa, P.; Hanzalek, Z., Timed automata approach to real time distributed system verification, Proceedings of the IEEE international workshop on factory communication systems, 2004, 407-410 (2004)
[35] Li S, Hu S, Wang S, Su L, Abdelzaher T, Gupta I, Pace R (2014) WOHA: deadline-aware map-reduce workflow scheduling framework over hadoop clusters. In: Proceedings of the IEEE 34th international conference on distributed computing systems. IEEE
[36] Bersani, MM; Frigeri, A.; Morzenti, A.; Pradella, M.; Rossi, M.; San Pietro, P., Constraint LTL satisfiability checking without automata, J Appl Log, 12, 4, 522-557 (2014) · Zbl 1310.68141
[37] MacQueen J (1967) Some methods for classification and analysis of multivariate observations. In: Le Cam LM, Neyman J (eds) Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, vol 1. University of California Press, pp 281-297 · Zbl 0214.46201
[38] Miller, A.; Donaldson, AF; Calder, M., Symmetry in temporal logic model checking, ACM Comput Surv, 38, 3, 8 (2006)
[39] Marconi F, Quattrocchi G, Baresi L, Bersani MM, Rossi M (2018) On the timed analysis of big-data applications. In: Dutle A, Muñoz CA, Narkawicz A (eds) NASA formal methods—10th international symposium, NFM 2018, Newport News, VA, USA, 17-19 Apr 2018, Proceedings. Lecture notes in computer science, vol 10811. Springer, pp 315-332
[40] Ousterhout K, Rasti R, Ratnasamy S, Shenker S, Chun B (2015) Making sense of performance in data analytics frameworks. In: Proceedings of the 12th USENIX conference on networked systems design and implementation. USENIX
[41] Perez, D.; Bernardi, S.; Merseguer, J.; JoJ, Requeno; Casale, G.; Zhu, L., DICE simulation tools: final version (2017), DICE consortium: Deliverable, DICE consortium
[42] Palencia JC, Gonzalez Harbour M (1998) Schedulability analysis for tasks with static and dynamic offsets. In: Proceedings of the IEEE real-time systems symposium, pp 26-37
[43] Politecnico di Milano (2019) The Zot Bounded Model/Satisfiability Checker. github.com/fm-polimi/zot
[44] Wang, F., Efficient verification of timed automata with bdd-like data structures, Int J Softw Tools Technol Transf, 6, 1, 77-97 (2004)
[45] Waez, MTB; Dingel, J.; Rudie, K., A survey of timed automata for the development of real-time systems, Comput Sci Rev, 9, 1-26 (2013) · Zbl 1302.68180
[46] Yasmina Abdeddaïm, Y.; Asarin, E.; Maler, O., Scheduling with timed automata, Theor Comput Sci, 354, 2, 272-300 (2006) · Zbl 1088.68023
[47] Yu J, Chen H, Hu F (2015) SASM: improving Spark performance with adaptive skew mitigation. In: 2015 IEEE international conference on progress in informatics and computing (PIC)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.