×

GiVip: a visual profiler for distributed graph processing systems. (English) Zbl 07026993

Frati, Fabrizio (ed.) et al., Graph drawing and network visualization. 25th international symposium, GD 2017, Boston, MA, USA, September 25–27, 2017. Revised selected papers. Cham: Springer. Lect. Notes Comput. Sci. 10692, 256-271 (2018).
Summary: Analyzing large-scale graphs provides valuable insights in different application scenarios. While many graph processing systems working on top of distributed infrastructures have been proposed to deal with big graphs, the tasks of profiling and debugging their massive computations remain time consuming and error-prone. This paper presents GiViP, a visual profiler for distributed graph processing systems based on a Pregel-like computation model. GiViP captures the huge amount of messages exchanged throughout a computation and provides an interactive user interface for the visual analysis of the collected data. We show how to take advantage of GiViP to detect anomalies related to the computation and to the infrastructure, such as slow computing units and anomalous message patterns.
For the entire collection see [Zbl 1381.68007].

MSC:

68R10 Graph theory (including graph drawing) in computer science
68U05 Computer graphics; computational geometry (digital and algorithmic aspects)
PDF BibTeX XML Cite
Full Text: DOI arXiv

References:

[1] http://hadoop.apache.org/. Accessed 10 June 2017 · Zbl 1205.94072
[2] https://spark.apache.org/. Accessed 10 June 2017
[3] http://www.circos.ca. Accessed 10 June 2017
[4] Hpc toolkit (2011). http://hpctoolkit.org/index.html Accessed 22 Aug 2017 · Zbl 1281.94071
[5] Archambault, D.; Purchase, HC; Pinaud, B., Animation, small multiples, and the effect of mental map preservation in dynamic graphs, IEEE Trans. Vis. Comput. Graph., 17, 4, 539-552 (2011) · Zbl 1398.94112
[6] Argyriou, E.N., Symvonis, A., Vassiliou, V.: A fraud detection visualization system utilizing radial drawings and heat-maps. In: Laramee, R.S., Kerren, A., Braz, J. (eds.) IVAPP 2014, pp. 153-160. SciTePress (2014) · Zbl 1177.94181
[7] Arleo, A., Didimo, W., Liotta, G., Montecchiani, F.: GiViP: a visual profiler for distributed graph processing systems. ArXiv e-prints http://arxiv.org/abs/1708.07985 (2017) · Zbl 07026993
[8] Arleo, A.; Didimo, W.; Liotta, G.; Montecchiani, F.; Hu, Y.; Nöllenburg, M., A distributed multilevel force-directed algorithm, Graph Drawing and Network Visualization, 3-17 (2016), Cham: Springer, Cham · Zbl 1446.94105
[9] Arleo, A.; Didimo, W.; Liotta, G.; Montecchiani, F., Large graph visualizations using a distributed computing platform, Inf. Sci., 381, 124-141 (2017)
[10] Baur, M.; Brandes, U.; Hromkovič, J.; Nagl, M.; Westfechtel, B., Crossing reduction in circular layouts, Graph-Theoretic Concepts in Computer Science, 332-343 (2004), Heidelberg: Springer, Heidelberg · Zbl 0435.94018
[11] Beck, F.; Burch, M.; Diehl, S.; Weiskopf, D., A taxonomy and survey of dynamic graph visualization, Comput. Graph. Forum, 36, 1, 133-159 (2017) · Zbl 1317.94103
[12] Behrisch, M.; Bach, B.; Hund, M.; Delz, M.; von Rüden, L.; Fekete, J.; Schreck, T., Magnostics: image-based search of interesting matrix views for guided network exploration, IEEE Trans. Vis. Comput. Graph., 23, 1, 31-40 (2017)
[13] Bostock, M.; Ogievetsky, V.; Heer, J., D \({^3}\) data-driven documents, IEEE Trans. Vis. Comput. Graph., 17, 12, 2301-2309 (2011) · Zbl 1231.68124
[14] Braun, B., Qin, H.: ddtrace: rich performance monitoring in distributed systems · Zbl 1294.94050
[15] Bruls, M., Huizing, K., van Wijk, J.J.: Squarified treemaps. In: de Leeuw, W.C., van Liere, R. (eds.) IEEE TCVG 2000. pp. 33-42. Eurographics Association (2000) · Zbl 0644.94012
[16] Burch, M.; Vehlow, C.; Beck, F.; Diehl, S.; Weiskopf, D., Parallel edge splatting for scalable dynamic graph visualization, IEEE Trans. Vis. Comput. Graph., 17, 12, 2344-2353 (2011)
[17] Byron, L.; Wattenberg, M., Stacked graphs - geometry & aesthetics, IEEE Trans. Vis. Comput. Graph., 14, 6, 1245-1252 (2008)
[18] CERN: Hadoop profiler (2016). https://github.com/cerndb/Hadoop-Profiler. Accessed 10 June 2017 · Zbl 1217.94097
[19] Ching, A.; Edunov, S.; Kabiljo, M.; Logothetis, D.; Muthukrishnan, S., One trillion edges: graph processing at Facebook-scale, PVLDB, 8, 12, 1804-1815 (2015) · Zbl 1039.94525
[20] Cohen, J., Graph twiddling in a mapreduce world, Comput. Sci. Eng., 11, 4, 29-41 (2009) · Zbl 1067.94538
[21] Crnovrsanin, T.; Chu, J.; Ma, K., An incremental layout method for visualizing online dynamic graphs, J. Graph Algorithms Appl., 21, 1, 55-80 (2017) · Zbl 1358.05261
[22] Dehkordi, HR; Eades, P.; Hong, S.; Nguyen, QH, Circular right-angle crossing drawings in linear time, Theor. Comput. Sci., 639, 26-41 (2016) · Zbl 1344.68176
[23] Di Battista, G.; Eades, P.; Tamassia, R.; Tollis, IG, Graph Drawing: Algorithms for the Visualization of Graphs (1999), Englewood Cliffs: Prentice-Hall, Englewood Cliffs · Zbl 1057.68653
[24] Dogrusoz, U.; Belviranli, ME; Dilek, A., CiSE: a circular spring embedder layout algorithm, IEEE Trans. Vis. Comput. Graph., 19, 6, 953-966 (2013)
[25] Elmqvist, N., Do, T.N., Goodell, H., Henry, N., Fekete, J.D.: ZAME: Interactive large-scale graph visualization. In: IEEE PacificVis 2008, pp. 215-222 (2008) · Zbl 1386.94094
[26] Elmqvist, N.; Fekete, JD, Hierarchical aggregation for information visualization: overview, techniques, and design guidelines, IEEE Trans. Vis. Comput. Graph., 16, 3, 439-454 (2010)
[27] Frishman, Y.; Tal, A., Online dynamic graph drawing, IEEE Trans. Vis. Comput. Graph., 14, 4, 727-740 (2008)
[28] Fuchs, J., Fischer, F., Mansmann, F., Bertini, E., Isenberg, P.: Evaluation of alternative glyph designs for time series data in a small multiple setting. In: Mackay, W.E., Brewster, S.A., Bødker, S. (eds.) 2013 ACM SIGCHI, pp. 3237-3246. ACM (2013)
[29] Gabrielli, L.; Rinzivillo, S.; Ronzano, F.; Villatoro, D.; Nin, J.; Villatoro, D., From Tweets to semantic trajectories: mining anomalous urban mobility patterns, Citizen in Sensor Networks, 26-35 (2014), Cham: Springer, Cham · Zbl 0881.94015
[30] Graham, SL; Kessler, PB; McKusick, MK, Gprof: a call graph execution profiler, ACM SIGPLAN Not., 39, 4, 49-57 (2004)
[31] Gulzar, M.A., Interlandi, M., Yoo, S., Tetali, S.D., Condie, T., Millstein, T.D., Kim, M.: BigDebug: debugging primitives for interactive big data processing in spark. In: ICSE 2016, pp. 784-795. ACM (2016) · Zbl 1028.94511
[32] Havre, S., Hetzler, B., Nowell, L.: Themeriver: visualizing theme changes over time. In: IEEE InfoVis 2000, pp. 115-123. IEEE (2000) · Zbl 1295.94111
[33] Heer, J.; Bostock, M.; Ogievetsky, V., A tour through the visualization zoo, Commun. ACM, 53, 6, 59-67 (2010) · Zbl 1279.94099
[34] Henry, N.; Fekete, J.; McGuffin, MJ, NodeTrix: a hybrid visualization of social networks, IEEE Trans. Vis. Comput. Graph., 13, 6, 1302-1309 (2007)
[35] Hochheiser, H.; Shneiderman, B., Dynamic query tools for time series data sets: timebox widgets for interactive exploration, Inform. Vis., 3, 1, 1-18 (2004) · Zbl 1215.94060
[36] Holten, D., Hierarchical edge bundles: visualization of adjacency relations in hierarchical data, IEEE Trans. Vis. Comput. Graph., 12, 5, 741-748 (2006) · Zbl 1419.94043
[37] Jackson, J.: Facebook’s graph search puts Apache Giraph on the map (2013). http://www.pcworld.com/article/2046680/facebooks-graph-search-puts-apache-giraph-on-the-map.html/. Accessed 10 June 2017
[38] Javed, W.; McDonnel, B.; Elmqvist, N., Graphical perception of multiple time series, IEEE Trans. Vis. Comput. Graph., 16, 6, 927-934 (2010) · Zbl 1122.68448
[39] Johnson, A.: Introducing statsd-jvm-profiler: a JVM profiler for hadoop (2015). https://github.com/cerndb/Hadoop-Profiler. Accessed 10 June 2017
[40] Krstajic, M.; Bertini, E.; Keim, D., CloudLines: compact display of event episodes in multiple time-series, IEEE Trans. Vis. Comput. Graph., 17, 12, 2432-2439 (2011)
[41] Krzywinski, M.; Schein, J.; Birol, I.; Connors, J.; Gascoyne, R.; Horsman, D.; Jones, SJ; Marra, MA, Circos: an information aesthetic for comparative genomics, Genome Res., 19, 9, 1639-1645 (2009) · Zbl 1359.94626
[42] Lumsdaine, A.; Gregor, DP; Hendrickson, B.; Berry, JW, Challenges in parallel graph processing, Parallel Process. Lett., 17, 1, 5-20 (2007)
[43] Malewicz, G., Austern, M.H., Bik, A.J.C., Dehnert, J.C., Horn, I., Leiser, N., Czajkowski, G.: Pregel: a system for large-scale graph processing. In: ACM SIGMOD 2010, pp. 135-146. ACM (2010) · Zbl 1281.94057
[44] Masuda, S., Kashiwabara, T., Nakajima, K., Fujisawa, T.: On the NP-completeness of a computer network layout problem. In: IEEE International Symposium on Circuits and Systems, pp. 292-295 (1987)
[45] McCune, RR; Weninger, T.; Madey, G., Thinking like a vertex: a survey of vertex-centric frameworks for large-scale distributed graph processing, ACM Comput. Surv., 48, 2, 25:1-25:39 (2015)
[46] McLachlan, P., Munzner, T., Koutsofios, E., North, S.: LiveRAC: interactive visual exploration of system management time-series data. In: 2008 ACM SIGCHI, pp. 1483-1492. ACM (2008)
[47] Plaisant, C., Milash, B., Rose, A., Widoff, S., Shneiderman, B.: LifeLines: visualizing personal histories. In: 1996 ACM SIGCHI, pp. 221-227. ACM (1996)
[48] Playfair, W.: The Commercial and Political Atlas: Representing, by Means of Stained Copper-plate Charts, the Progress of the Commerce, Revenues, Expenditure and Debts of England During the Whole of the Eighteenth Century. Printed by T. Burton for J. Wallis, etc; 3rd edn. (1801)
[49] Purchase, HC, Effective information visualisation: a study of graph drawing aesthetics and algorithms, Interact. Comput., 13, 2, 147-162 (2000)
[50] Purchase, HC; Carrington, DA; Allder, JA, Empirical evaluation of aesthetics-based graph layout, Empirical Softw. Eng., 7, 3, 233-255 (2002) · Zbl 1010.68636
[51] Purchase, HC; Hoggan, E.; Görg, C.; Kaufmann, M.; Wagner, D., How important is the “Mental Map”? - an empirical investigation of a dynamic graph layout algorithm, Graph Drawing, 184-195 (2007), Heidelberg: Springer, Heidelberg · Zbl 1185.68494
[52] Reinders, J.: VTune Performance Analyzer Essentials. Intel Press (2005)
[53] Saito, T., Miyamura, H.N., Yamamoto, M., Saito, H., Hoshiya, Y., Kaseda, T.: Two-tone pseudo coloring: Compact visualization for one-dimensional data. In: 2005 IEEE InfoVis, pp. 173-180. IEEE (2005)
[54] Salihoglu, S., Shin, J., Khanna, V., Truong, B.Q., Widom, J.: Graft: a debugging tool for Apache Giraph. In: ACM SIGMOD 2015, pp. 1403-1408. ACM (2015)
[55] Salihoglu, S., Widom, J.: GPS: a graph processing system. In: SSDBM 2013, pp. 22:1-22:12. ACM (2013)
[56] Seo, S., Yoon, E.J., Kim, J., Jin, S., Kim, J., Maeng, S.: HAMA: an efficient matrix computation with the mapreduce framework. In: CloudCom 2010, pp. 721-726. IEEE (2010)
[57] Six, JM; Tollis, IG; Kratochvíyl, J., A framework for circular drawings of networks, Graph Drawing, 107-116 (1999), Heidelberg: Springer, Heidelberg
[58] Stitz, H.; Gratzl, S.; Aigner, W.; Streit, M., ThermalPlot: visualizing multi-attribute time-series data using a thermal metaphor, IEEE Trans. Vis. Comput. Graph., 22, 12, 2594-2607 (2016)
[59] Stitz, H., Gratzl, S., Krieger, M., Streit, M.: CloudGazer: a divide-and-conquer approach to monitoring and optimizing cloud-based networks. In: IEEE PacificVis 2015, pp. 175-182. IEEE (2015)
[60] Tang, J.: Graph mining with Apache Giraph (2013). https://www.slideshare.net/Hadoop_Summit/tang-june26-205pmroom210cv2, Accessed 10 June 2017
[61] Tufte, E., The Visual Display of Quantitative Information (1983), Cheshire: Graphics Press, Cheshire
[62] Valiant, LG, A bridging model for parallel computation, Commun. ACM, 33, 8, 103-111 (1990)
[63] Vaquero, L.M., Cuadrado, F., Logothetis, D., Martella, C.: Adaptive partitioning for large-scale dynamic graphs. In: IEEE ICDCS 2014, pp. 144-153. IEEE (2014)
[64] Ward, MO, Multivariate data glyphs: principles and practice, Handbook of Data Visualization (2008), Heidelberg: Springer, Heidelberg · Zbl 1145.68388
[65] Ware, C.; Purchase, HC; Colpoys, L.; McGill, M., Cognitive measurements of graph aesthetics, Inform. Vis., 1, 2, 103-110 (2002)
[66] Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauley, M., Franklin, M.J., Shenker, S., Stoica, I.: Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing. In: NSDI 2012, p. 2. USENIX Association (2012)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.