Algebraic statistics and contingency table problems: log-linear models, likelihood estimation, and disclosure limitation. (English) Zbl 1166.13033

Putinar, Mihai (ed.) et al., Emerging applications of algebraic geometry. Papers of the IMA workshops Optimization and control, January 16–20, 2007 and Applications in biology, dynamics, and statistics, March 5–9, 2007, held at IMA, Minneapolis, MN, USA. New York, NY: Springer (ISBN 978-0-387-09685-8/hbk). The IMA Volumes in Mathematics and its Applications 149, 63-88 (2009).
Several areas of statistics have benefited from a novel approach based on tools made available by computational algebraic geometry. These developments are generally referred to as algebraic statistics. The analysis of categorical data is one of the domains on which the new formalism had a strong impact.
The paper under review surveys two classes of contingency table problems: maximum likelihood estimation for log-linear models and disclosure limitation. The relevant concepts from statistics are clearly defined, and the corresponding geometric objects or algebraic notions are pointed out. Results that put the link between the two problems on a firm theoretical basis are quoted. The risk of identifying individuals associated with counts in contingency tables is assessed in various ways, among them the computation of bounds for cell entries and the enumeration of possible table realizations. Until recently, the recipe was to solve a linear programming problem via the simplex method. In the framework of algebraic statistics, both tasks can be accomplished with various tools for determining Gröbner and Markov bases for suitable systems of polynomial equations. Advantages and disadvantages of the two kinds of computations are clearly stated. The rigorous treatment is complemented by a series of enlightening examples that illustrate the computational complexity of the solutions available to date. In a final section, the authors present seven open problems related to the topics of the article. All of them are challenging from both a theoretical and a computational point of view.
This very well-written paper is useful to a broad audience in computational algebraic geometry and statistics.
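The two disclosure-limitation tasks named in the review, bounds for cell entries and enumeration of table realizations, can be illustrated in the simplest case of a 2×2 table with fixed row and column totals. The sketch below is not taken from the paper: the function names `frechet_bounds` and `enumerate_2x2` are hypothetical, and the example assumes only the classical Fréchet bounds for a two-way table. In this small case consecutive tables differ by the single move (+1, −1, −1, +1); the multi-way tables treated in the paper need linear programming or Gröbner and Markov basis computations instead.

```python
from typing import List, Tuple

Table = Tuple[Tuple[int, int], Tuple[int, int]]


def frechet_bounds(row_total: int, col_total: int, grand_total: int) -> Tuple[int, int]:
    """Frechet bounds for one cell of a two-way table with known margins."""
    lower = max(0, row_total + col_total - grand_total)
    upper = min(row_total, col_total)
    return lower, upper


def enumerate_2x2(rows: Tuple[int, int], cols: Tuple[int, int]) -> List[Table]:
    """List every nonnegative integer 2x2 table with the given margins.

    Illustrative helper (an assumption of this sketch, not the paper's method):
    the (1,1) cell runs over its Frechet range and fixes the other three cells.
    """
    n = sum(rows)
    if n != sum(cols):
        raise ValueError("row and column totals must have the same sum")
    lo, hi = frechet_bounds(rows[0], cols[0], n)
    tables = []
    for a in range(lo, hi + 1):
        # Increasing a by one applies the move (+1, -1, -1, +1) to the table.
        tables.append(((a, rows[0] - a), (cols[0] - a, rows[1] - cols[0] + a)))
    return tables


if __name__ == "__main__":
    for table in enumerate_2x2((3, 2), (2, 3)):
        print(table)
```

A narrow range of feasible values for a cell, or a short list of feasible tables, signals a higher risk that the released margins disclose an individual count.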
For the entire collection see [Zbl 1151.14004].

MSC:

13P10 Gröbner bases; other bases for ideals and modules (e.g., Janet and border bases)
62B05 Sufficient statistics and fields
62H17 Contingency tables
62P25 Applications of statistics to social sciences