GEDEVO: an evolutionary graph edit distance algorithm for biological network alignment. (English) Zbl 1281.92002

Beißbarth, Tim (ed.) et al., German conference on bioinformatics 2013, GCB’13, Göttingen, Germany, September 10–13, 2013. Selected papers based on the presentations at the conference. Wadern: Schloss Dagstuhl – Leibniz Zentrum für Informatik (ISBN 978-3-939897-59-0). OASIcs – OpenAccess Series in Informatics 34, 68-79, electronic only (2013).
Summary: With the so-called OMICS technology the scientific community has generated huge amounts of data that allow us to reconstruct the interplay of all kinds of biological entities. The emerging interaction networks are usually modeled as graphs with thousands of nodes and tens of thousands of edges between them. In addition to sequence alignment, the comparison of biological networks has shown great potential for inferring the biological function of proteins and genes. However, the corresponding network alignment problem is computationally hard and theoretically intractable for real-world instances.
We therefore developed GEDEVO, a novel tool for efficient graph comparison dedicated to real-world size biological networks. Underlying our approach is the so-called graph edit distance (GED) model, where one graph is to be transformed into another one with a minimal number of (or, more generally, minimal cost for) edge insertions and deletions. We present a novel evolutionary algorithm aiming to minimize the GED, and we compare our implementation against state-of-the-art tools: SPINAL, GHOST, C-GRAAL, and MI-GRAAL. On a set of protein-protein interaction networks from different organisms we demonstrate that GEDEVO outperforms the current methods. It thus refines the previously suggested alignments based on topological information only.
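The GED model described above can be illustrated with a minimal Python sketch. It is not the GEDEVO implementation: the function names `edit_cost` and `evolve_alignment`, the unit cost per edge insertion or deletion, the restriction to graphs with equally many nodes, and the simple swap-mutation search are all assumptions made for illustration only.

```python
import random


def edit_cost(edges_a, edges_b, mapping):
    """Number of edge insertions plus deletions needed to turn graph A
    into graph B under a one-to-one node mapping (A-node -> B-node).
    Assumes undirected graphs and unit cost per edit (illustrative)."""
    # Relabel every edge of A with the B-nodes it is mapped to.
    mapped_a = {frozenset((mapping[u], mapping[v])) for u, v in edges_a}
    set_b = {frozenset((u, v)) for u, v in edges_b}
    # Edges present in exactly one of the two sets must be inserted or deleted.
    return len(mapped_a ^ set_b)


def evolve_alignment(nodes_a, nodes_b, edges_a, edges_b,
                     generations=2000, seed=0):
    """Toy evolutionary search: keep one mapping, mutate it by swapping
    the images of two A-nodes, and accept the child if it is not worse.
    Assumes len(nodes_a) == len(nodes_b) (a simplification)."""
    rng = random.Random(seed)
    targets = list(nodes_b)
    rng.shuffle(targets)
    best = dict(zip(nodes_a, targets))
    best_cost = edit_cost(edges_a, edges_b, best)
    for _ in range(generations):
        if best_cost == 0:
            break  # the graphs are aligned without any edits
        u, v = rng.sample(list(nodes_a), 2)
        child = dict(best)
        child[u], child[v] = child[v], child[u]  # swap mutation
        cost = edit_cost(edges_a, edges_b, child)
        if cost <= best_cost:
            best, best_cost = child, cost
    return best, best_cost


if __name__ == "__main__":
    # Hypothetical toy graphs: a path and a triangle on three nodes.
    path = [(1, 2), (2, 3)]
    triangle = [("a", "b"), ("b", "c"), ("a", "c")]
    mapping, cost = evolve_alignment([1, 2, 3], ["a", "b", "c"], path, triangle)
    print(mapping, cost)
```

A search of this kind keeps only a single candidate and may stall in local optima on larger graphs; it is meant to show how an alignment induces an edit cost, not to reproduce the population-based method of the paper.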
With GEDEVO, we account for the rapidly growing number and size of available biological networks. The software as well as all data sets used are publicly available at http://gedevo.mpi-inf.mpg.de.
For the entire collection see [Zbl 1279.92004].


92-04 Software, source code, etc. for problems pertaining to biology
92C42 Systems biology, networks
05C90 Applications of graph theory
90C59 Approximation methods and heuristics in mathematical programming