Bayesian networks in R. With applications in systems biology.

*(English)* Zbl 1272.62005
Use R! 48. New York, NY: Springer (ISBN 978-1-4614-6445-7/pbk; 978-1-4614-6446-4/ebook). xiii, 157 p. (2013).

The complexity of real-life systems and the frequent failure of reductionist representations to model known and novel associations led to the rise of Bayesian analysis. Regardless of whether the analysed datasets include temporal information, graphical models (e.g. Bayesian networks) have an advantage over systematic measures such as correlations: they represent the joint probability distribution of all the entities of interest.
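
As a minimal illustration of this point (not taken from the book), the sketch below builds a three-variable chain A -> B -> C from hypothetical conditional probability tables. The variable names and all numbers are assumptions chosen for the example. It shows that A and C are associated marginally, yet become independent once B is known, a distinction that a pairwise correlation alone would not reveal.

```python
from itertools import product

# Hypothetical conditional probability tables for a chain A -> B -> C.
# All values are illustrative assumptions, not data from the book.
p_a = {0: 0.5, 1: 0.5}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
p_c_given_b = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}


def joint(a, b, c):
    """Joint probability from the network factorisation P(A) P(B|A) P(C|B)."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]


def prob(**fixed):
    """Probability of the event described by the fixed variable values."""
    total = 0.0
    for a, b, c in product((0, 1), repeat=3):
        values = {"a": a, "b": b, "c": c}
        if all(values[name] == value for name, value in fixed.items()):
            total += joint(a, b, c)
    return total


# Marginally, knowing A changes the probability of C.
p_c1_given_a0 = prob(a=0, c=1) / prob(a=0)
p_c1_given_a1 = prob(a=1, c=1) / prob(a=1)

# Given B, knowing A no longer changes the probability of C.
p_c1_given_a0_b1 = prob(a=0, b=1, c=1) / prob(a=0, b=1)
p_c1_given_a1_b1 = prob(a=1, b=1, c=1) / prob(a=1, b=1)

print(p_c1_given_a0, p_c1_given_a1)
print(p_c1_given_a0_b1, p_c1_given_a1_b1)
```

The first pair of values differs, while the second pair coincides, which is the conditional independence encoded by the missing arc between A and C.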

The book describes the theoretical concepts related to Bayesian networks in a gradual manner, with examples from the rapidly growing field of high-throughput analyses. To facilitate the understanding of both the theory and the R functions, exercises are proposed and their solutions are provided.

The book is structured in five chapters. The first chapter gives a brief introduction to graph theory and a minimal set of functions required for working in the R environment. The second chapter commences with the essential definitions and properties of Bayesian networks. Next, static Bayesian network modelling is discussed. Constraint-based, score-based and hybrid structure learning algorithms are presented in detail. The choice of distributions, the conditional independence tests and the network scores are illustrated using biological examples (gene expression profiles). The chapter concludes with a description of the corresponding R functions and details on how to plot network structures and on structure and parameter learning. The third chapter presents Bayesian networks in the presence of temporal information. It commences with concepts on time series and vector auto-regressive processes, followed by the essential definitions and properties of dynamic Bayesian networks. Next, the learning algorithms are discussed and examples on multivariate time series are presented in detail. The fourth chapter focuses on inference algorithms in static and dynamic Bayesian networks, building on the notions presented in Chapters 2 and 3. The book concludes with parallelization options aimed at overcoming the computational limitations of Bayesian networks on large, high-dimensional datasets. The foundations of parallel programming and the corresponding R functions are briefly presented. These are followed by applications to structure learning, parameter learning and inference procedures.

Reviewer: Irina Ioana Mohorianu (Norwich)

##### MSC:

62-02 | Research exposition (monographs, survey articles) pertaining to statistics |

62F15 | Bayesian inference |

68T35 | Theory of languages and software systems (knowledge-based systems, expert systems, etc.) for artificial intelligence |

62-04 | Software, source code, etc. for problems pertaining to statistics |

62P10 | Applications of statistics to biology and medical sciences; meta analysis |

92-02 | Research exposition (monographs, survey articles) pertaining to biology |

92C42 | Systems biology, networks |

05C90 | Applications of graph theory |