High-dimensional graphs and variable selection with the Lasso. (English) Zbl 1113.62082

Summary: The pattern of zero entries in the inverse covariance matrix of a multivariate normal distribution corresponds to conditional independence restrictions between variables. Covariance selection aims at estimating those structural zeros from data. We show that neighborhood selection with the Lasso is a computationally attractive alternative to standard covariance selection for sparse high-dimensional graphs. Neighborhood selection estimates the conditional independence restrictions separately for each node in the graph and is hence equivalent to variable selection for Gaussian linear models.
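The correspondence between structural zeros and regression coefficients that underlies this equivalence can be written out as follows. The notation ($K$, $\theta^a$, $\mathrm{ne}_a$, $\lambda$) is assumed here for illustration and is not taken from the summary; variables are taken to have mean zero, and the scaling of the Lasso criterion is one common convention among several.

```latex
% Assumed notation: X = (X_1, ..., X_p) ~ N(0, \Sigma), K = \Sigma^{-1}.
\[
  X_a \perp\!\!\!\perp X_b \,\big|\, X_{\{1,\dots,p\}\setminus\{a,b\}}
  \quad\Longleftrightarrow\quad K_{ab} = 0 .
\]
% Regressing X_a on all remaining variables gives
\[
  X_a = \sum_{b \neq a} \theta^a_b X_b + \varepsilon_a ,
  \qquad
  \theta^a_b = -\frac{K_{ab}}{K_{aa}} ,
\]
% so the neighborhood of node a is the support of its regression vector:
\[
  \mathrm{ne}_a = \{\, b \neq a : K_{ab} \neq 0 \,\}
               = \{\, b \neq a : \theta^a_b \neq 0 \,\} .
\]
% A Lasso estimate of this support, with penalty parameter \lambda \ge 0
% and n observations collected in the data matrix \mathbf{X}:
\[
  \hat\theta^{a,\lambda}
  = \arg\min_{\theta \,:\, \theta_a = 0}
    \Big( n^{-1} \lVert \mathbf{X}_a - \mathbf{X}\theta \rVert_2^2
          + \lambda \lVert \theta \rVert_1 \Big),
  \qquad
  \widehat{\mathrm{ne}}_a^{\lambda}
  = \{\, b : \hat\theta^{a,\lambda}_b \neq 0 \,\} .
\]
```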
We show that the proposed neighborhood selection scheme is consistent for sparse high-dimensional graphs. Consistency hinges on the choice of the penalty parameter. The oracle value for optimal prediction does not lead to a consistent neighborhood estimate. If the penalty is chosen instead to control the probability of falsely joining some distinct connectivity components of the graph, estimation is consistent for sparse graphs (with exponential rates), even when the number of variables grows as the number of observations raised to an arbitrary power.
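As a rough illustration of node-wise neighborhood selection, the following sketch regresses each variable on all others with a Lasso penalty and reads off the selected neighbors. Everything in it is an assumption made for the example rather than the authors' implementation: the function names (`soft_threshold`, `lasso_cd`, `neighborhood_selection`), the plain coordinate-descent solver, the `(1/(2n))` scaling of the squared loss, the "and"/"or" rule for symmetrizing the per-node estimates, the simulated chain graph, and the hand-picked penalty `lam = 0.2`, which is not the penalty choice analysed in the paper.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_sweeps=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()  # invariant: resid == y - X @ beta
    for _ in range(n_sweeps):
        for j in range(p):
            # Correlation of column j with the partial residual (beta_j removed).
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta


def neighborhood_selection(X, lam, rule="and"):
    """Estimate an undirected graph by one Lasso regression per node."""
    n, p = X.shape
    Z = (X - X.mean(axis=0)) / X.std(axis=0)  # centre and scale each column
    selected = np.zeros((p, p), dtype=bool)
    for a in range(p):
        others = [b for b in range(p) if b != a]
        beta = lasso_cd(Z[:, others], Z[:, a], lam)
        selected[a, others] = beta != 0.0
    # "and": keep an edge only if both endpoints select each other;
    # "or": keep it if either endpoint selects the other.
    return selected & selected.T if rule == "and" else selected | selected.T


# Hypothetical data: a chain 0 - 1 - 2 plus an isolated node 3.
rng = np.random.default_rng(0)
n = 2000
x0 = rng.standard_normal(n)
x1 = 0.8 * x0 + rng.standard_normal(n)
x2 = 0.8 * x1 + rng.standard_normal(n)
x3 = rng.standard_normal(n)
data = np.column_stack([x0, x1, x2, x3])

adj = neighborhood_selection(data, lam=0.2, rule="and")
edges = [(a, b) for a in range(4) for b in range(a + 1, 4) if adj[a, b]]
print(edges)
```

Each node's regression is independent of the others, which is what makes the approach computationally attractive for large graphs: the per-node problems can be solved separately, and the only coupling is the final symmetrization step.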


MSC:
62H99 Multivariate analysis
62J07 Ridge regression; shrinkage estimators (Lasso)
05C90 Applications of graph theory
62F12 Asymptotic properties of parametric estimators
62H12 Estimation in multivariate analysis
65C60 Computational problems in statistics (MSC2010)
62J05 Linear regression; mixed models


Software: PDCO; lars; MIM

