WebGraph
swMATH ID: 30097
Software Authors: Boldi, P.; Vigna, S.
Description: WebGraph is a framework for graph compression aimed at studying web graphs. It provides simple ways to manage very large graphs, exploiting modern compression techniques. More precisely, it is currently made of:
- A set of flat codes, called ζ codes, which are particularly suitable for storing web graphs (or, in general, integers with power-law distribution in a certain exponent range). The fact that these codes work well can be easily tested empirically, but we also try to provide a detailed mathematical analysis.
- Algorithms for compressing web graphs that exploit gap compression and referentiation (à la LINK), intervalisation and ζ codes to provide a high compression ratio (see our datasets). The algorithms are controlled by several parameters, which provide different tradeoffs between access speed and compression ratio.
- Algorithms for accessing a compressed graph without actually decompressing it, using lazy techniques that delay the decompression until it is actually necessary.
- Algorithms for analysing very large graphs, such as HyperBall, which has been used to show that Facebook has just four degrees of separation.
- A complete, documented implementation of the algorithms above in Java, distributed under the GNU General Public License. Besides a clearly defined API, we also provide several classes that modify (e.g., transpose) or recompress a graph, so as to experiment with various settings.
- Datasets for very large graphs (e.g., a billion links). These are either gathered from public sources (such as WebBase), or produced by UbiCrawler and BUbiNG.
In the end, with WebGraph you can access and analyse very large web graphs. Using WebGraph is as easy as installing a few jar files and downloading a dataset. This makes studying phenomena such as PageRank, the distribution of graph properties of the web graph, etc. very easy.
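To make the ζ codes and gap compression mentioned in the description concrete, here is a minimal Python sketch. It is not taken from WebGraph and does not reproduce its on-disk format. It assumes the usual definition of the ζ code with shrinking parameter k: a positive integer x in [2^(hk), 2^((h+1)k) - 1] is written as h + 1 in unary, followed by x - 2^(hk) in minimal binary over an interval of size 2^((h+1)k) - 2^(hk). The function names and the string-of-bits output are illustrative choices, and `encode_successors` is a simplified stand-in for gap compression of one successor list.

```python
def _bits(value: int, width: int) -> str:
    """Binary representation of value, left-padded to width bits ("" if width is 0)."""
    return format(value, "b").zfill(width) if width > 0 else ""


def unary(n: int) -> str:
    """Unary code of n >= 1: n - 1 zeros followed by a single one."""
    return "0" * (n - 1) + "1"


def minimal_binary(x: int, z: int) -> str:
    """Minimal binary code of x in [0, z - 1]."""
    s = (z - 1).bit_length()          # equals ceil(log2(z)) for z >= 1
    threshold = (1 << s) - z          # this many values get the shorter s - 1 bit word
    if x < threshold:
        return _bits(x, s - 1)
    return _bits(x - z + (1 << s), s)


def zeta(x: int, k: int) -> str:
    """Zeta code with shrinking parameter k of a positive integer x, as a bit string."""
    if x < 1 or k < 1:
        raise ValueError("x and k must be positive")
    h = (x.bit_length() - 1) // k     # x lies in [2^(hk), 2^((h+1)k) - 1]
    low = 1 << (h * k)
    size = (1 << ((h + 1) * k)) - low
    return unary(h + 1) + minimal_binary(x - low, size)


def encode_successors(successors: list[int], k: int = 3) -> str:
    """Illustrative gap coding: zeta-code the differences of a strictly
    increasing list of non-negative node ids (assumed simplification,
    not the actual WebGraph format)."""
    out = []
    prev = -1                         # makes the first gap successors[0] + 1 >= 1
    for s in successors:
        out.append(zeta(s - prev, k))
        prev = s
    return "".join(out)
```

With k = 1 the sketch yields the same codewords as Elias γ coding, which is a convenient sanity check. Larger k gives shorter codewords to moderately large gaps at the cost of slightly longer codewords for very small ones.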
Homepage: http://webgraph.di.unimi.it/
Related Software: UbiCrawler; SNAP; SparseMatrix; DIMACS; Pregel; GraphChi; Pajek datasets; Pajek; Ligra; igraph; BUbiNG; METIS; gSpan; PEGASUS; ANF; ILUPACK; KONECT; GraphFrames; NetworkX; EmptyHeaded
Cited in: 53 Publications

Cited by 144 Authors
5 Navarro, Gonzalo; 3 Bringmann, Karl; 3 Carpentieri, Bruno; 3 Keusch, Ralph; 3 Lengler, Johannes; 3 Litvak, Nelly; 3 Shen, Zhaoli; 3 Wen, Chun; 2 Barbay, Jérémy; 2 Claude, Francisco; 2 Crespelle, Christophe; 2 Gagie, Travis; 2 Gu, Xian-Ming; 2 Hager, William W.; 2 Huang, Ting-Zhu; 2 Khan, Kifayat Ullah; 2 Ladra, Susana; 2 Lee, Youngkoo; 2 Nawaz, Waqas; 2 Strash, Darren; 2 Towsley, Donald Fred; 2 van der Hoorn, Pim; 1 Akiba, Takuya; 1 Aldous, David John; 1 Anh, Tu Nguyen; 1 Arroyuelo, Diego; 1 Avrachenkov, Konstantin Evgen’evich; 1 Bessis, Nik; 1 Bieniecki, Wojciech; 1 Boldi, Paolo; 1 Bonchi, Francesco; 1 Brandes, Ulrik; 1 Bressan, Marco; 1 Broß, Jan; 1 Chen, Chen; 1 Chen, Yaoliang; 1 Coimbra, Miguel E.; 1 Crescenzi, Pierluigi; 1 d’Aspremont, Alexandre; 1 Davis, Timothy Alden; 1 Dayar, Tugrul; 1 de Bernardo, Guillermo; 1 Dolgorsuren, Batjargal; 1 El Karoui, Noureddine; 1 Faloutsos, Christos; 1 Ferdous, S. M.; 1 Ferres, Leo; 1 Firmani, Donatella; 1 Fischer, Johannes; 1 Francisco, Alexandre P.; 1 Fuentes-Sepúlveda, José; 1 Gambette, Philippe; 1 García-Soriano, David; 1 Gebremedhin, Assefaw Hadish; 1 Georgiadis, Loukas; 1 Glaria, Felipe; 1 Gleich, David F.; 1 Gog, Simon; 1 Grabowski, Szymon; 1 Grossi, Roberto; 1 Guan, Xiaohong; 1 Habib, Michel A.; 1 Hamann, Michael; 1 Hauck, Matthias; 1 He, Meng; 1 Hernández, Cecilia; 1 Hrotkó, Joana; 1 Huynh, The Dang; 1 Italiano, Giuseppe Francesco; 1 Iwata, Yoichi; 1 Jordan, Michael Irwin; 1 Kang, U.; 1 Kolodziej, Scott P.; 1 Koutra, Danai; 1 Lagraa, Sofiane; 1 Lamm, Sebastian; 1 Lanzi, Leonardo; 1 Latapy, Matthieu; 1 Laura, Luigi; 1 Le, Tien-Nam; 1 Lei, Kai; 1 Liang, Yuzhi; 1 Lonati, Violetta; 1 Lu, Yiqi; 1 Lui, John C. S.; 1 Lyu, Ziyu; 1 Madduri, Kamesh; 1 Mania, Horia; 1 Manne, Fredrik; 1 Marino, Andrea; 1 Mathieu, Fabien; 1 Maus, Yannic; 1 Molla, Anisur Rahaman; 1 Navlakha, Saket; 1 Nekrich, Yakov; 1 Noyan, Gökçe N.; 1 Olvera-Cravioto, Mariana; 1 Pan, Xinghao; 1 Papailiopoulos, Dimitris S.; 1 Paradies, Marcus; ...and 44 more Authors

Cited in 30 Serials
5 Theoretical Computer Science; 4 Information Sciences; 4 Internet Mathematics; 3 Algorithmica; 3 Journal of Discrete Algorithms; 2 Applied Mathematics and Computation; 2 Journal of Computational and Applied Mathematics; 2 Information and Computation; 2 SIAM Journal on Scientific Computing; 2 Data Mining and Knowledge Discovery; 1 Computers & Mathematics with Applications; 1 Discrete Applied Mathematics; 1 Discrete Mathematics; 1 ACM Transactions on Mathematical Software; 1 Computing; 1 Journal of Computer and System Sciences; 1 Neural Computation; 1 The Annals of Applied Probability; 1 Computational Geometry; 1 Journal of Global Optimization; 1 Automation and Remote Control; 1 Foundations of Computing and Decision Sciences; 1 SIAM Journal on Optimization; 1 ETNA. Electronic Transactions on Numerical Analysis; 1 Journal of Heuristics; 1 RAIRO. Theoretical Informatics and Applications; 1 Probability in the Engineering and Informational Sciences; 1 Acta Numerica; 1 Statistical Analysis and Data Mining; 1 Electronic Journal of Statistics

Cited in 11 Fields
39 Computer science (68-XX)
29 Combinatorics (05-XX)
8 Numerical analysis (65-XX)
7 Operations research, mathematical programming (90-XX)
4 Game theory, economics, finance, and other social and behavioral sciences (91-XX)
3 Linear and multilinear algebra; matrix theory (15-XX)
3 Probability theory and stochastic processes (60-XX)
3 Statistics (62-XX)
1 Biology and other natural sciences (92-XX)
1 Systems theory; control (93-XX)
1 Information and communication theory, circuits (94-XX)