Optimal structure identification with greedy search. (English) Zbl 1084.68519

Summary: We prove the so-called “Meek Conjecture”. In particular, we show that if a DAG \({\mathcal H}\) is an independence map of another DAG \({\mathcal G}\), then there exists a finite sequence of edge additions and covered edge reversals in \({\mathcal G}\) such that (1) after each edge modification \({\mathcal H}\) remains an independence map of \({\mathcal G}\) and (2) after all modifications \({\mathcal G}={\mathcal H}\). As shown by C. Meek [PhD thesis (1997)], this result has an important consequence for Bayesian approaches to learning Bayesian networks from data: in the limit of large sample size, there exists a two-phase greedy search algorithm that – when applied to a particular sparsely-connected search space – provably identifies a perfect map of the generative distribution if that perfect map is a DAG. We provide a new implementation of the search space, using equivalence classes as states, for which all operators used in the greedy search can be scored efficiently using local functions of the nodes in the domain. Finally, using both synthetic and real-world datasets, we demonstrate that the two-phase greedy approach leads to good solutions when learning with finite sample sizes.


68P10 Searching and sorting
68T05 Learning and adaptive systems in artificial intelligence
68T35 Theory of languages and software systems (knowledge-based systems, expert systems, etc.) for artificial intelligence


meek conjecture
Full Text: DOI