×

Estimating networks with jumps. (English) Zbl 1295.62032

Summary: We study the problem of estimating a temporally varying coefficient and varying structure (VCVS) graphical model underlying data collected over a period of time, such as social states of interacting individuals or microarray expression profiles of gene networks, as opposed to i.i.d. data from an invariant model widely considered in current literature of structural estimation. In particular, we consider the scenario in which the model evolves in a piece-wise constant fashion. We propose a procedure that estimates the structure of a graphical model by minimizing the temporally smoothed L1 penalized regression, which allows jointly estimating the partition boundaries of the VCVS model and the coefficient of the sparse precision matrix on each block of the partition. A highly scalable proximal gradient method is proposed to solve the resultant convex optimization problem; and the conditions for sparsistent estimation and the convergence rate of both the partition boundaries and the network structure are established for the first time for such estimators.

MSC:

62G05 Nonparametric estimation
62A09 Graphical methods in statistics
62G20 Asymptotic properties of nonparametric inference
05C90 Applications of graph theory
62M10 Time series, auto-correlation, regression, etc. in statistics (GARCH)
91D30 Social networks; opinion dynamics

Software:

glasso
PDFBibTeX XMLCite
Full Text: DOI arXiv Euclid

References:

[1] A. Ahmed and E. P. Xing. Recovering time-varying networks of dependencies in social and biological studies., Proc. Natl. A. Sci. , 106(29) :11878-11883, 2009.
[2] J. Bai and P. Perron. Estimating and testing linear models with multiple structural changes., Econometrica , 66(1):47-78, 1998. · Zbl 1056.62523
[3] O. Banerjee, L. El Ghaoui, and A. d’Aspremont. Model selection through sparse maximum likelihood estimation for multivariate gaussian or binary data., The Journal of Machine Learning Research , 9: 485-516, 2008a. · Zbl 1225.68149
[4] O. Banerjee, L. El Ghaoui, and A. d’Aspremont. Model selection through sparse maximum likelihood estimation for multivariate gaussian or binary data., J. Mach. Learn. Res. , 9:485-516, 2008b. ISSN 1533-7928. · Zbl 1225.68149
[5] A. Beck and M. Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems., SIAM J. Imaging Sci. , 2(1):183-202, 2009. · Zbl 1175.94009
[6] P. J. Bickel and E. Levina. Regularized estimation of large covariance matrices., Ann. Stat. , 36(1):199-227, 2008. · Zbl 1132.62040
[7] S. Boyd and L. Vandenberghe., Convex Optimization . Cambridge University Press, 2004. · Zbl 1058.90049
[8] P. Brucker. An o(n) algorithm for quadratic knapsack problems., Operations Research Letters , 3(3): 163-166, 1984. · Zbl 0544.90086
[9] F. Bunea. Honest variable selection in linear and logistic regression models via \(\ell_1\) and \(\ell_1+\ell_2\) penalization., Electron. J. Stat. , 2 :1153, 2008. · Zbl 1320.62170
[10] K. R. Davidson and S. J. Szarek. Local operator theory, random matrices and Banach spaces., Handbook of the geometry of Banach spaces , 1: 317-366, 2001. · Zbl 1067.46008
[11] A. P. Dempster. Covariance selection., Biometrics , 28(1):157-175, 1972.
[12] J. Fan, Y. Feng, and Y. Wu. Network exploration via the adaptive LASSO and SCAD penalties., Ann. Appl. Stat. , 3(2):521-541, 2009. · Zbl 1166.62040
[13] J. Friedman, T. Hastie, and R. Tibshirani. Sparse inverse covariance estimation with the graphical lasso., Biostat , 9(3):432-441, 2008. · Zbl 1143.62076
[14] L. Getoor and B. Taskar., Introduction to Statistical Relational Learning (Adaptive Computation and Machine Learning) . The MIT Press, 2007. · Zbl 1141.68054
[15] J. Guo, E. Levina, G. Michailidis, and J. Zhu. Joint Structure Estimation for Categorical Markov Networks. Technical report, Department of Statistics, University of Michigan, 2010a. · Zbl 1203.62190
[16] J. Guo, E. Levina, G. Michailidis, and J. Zhu. Joint Estimation of Multiple Graphical Models., Biometrika, to appear , 2010b. · Zbl 1214.62058
[17] Zaïd Harchaoui and Céline Lévy-Leduc. Multiple change-point estimation with a total-variation penalty., J. Am. Stat. Soc. , 105(492) :1480-1493, 2010. · Zbl 1388.62211
[18] T. Hastie and R. Tibshirani. Varying-coefficient models., J. R. Stat. Soc. B , 55(4):757-796, 1993. ISSN 00359246. · Zbl 0796.62060
[19] M. Kolar and E. P. Xing. Sparsistent estimation of Time-Varying discrete markov random fields. Technical report, Machine Learning Department, Carnegie Mellon University, 2009. Available at arxiv, 0907.2337.
[20] M. Kolar, A. P. Parikh, and E. P. Xing. On sparse nonparametric conditional covariance selection. In, Proc 27th Ann. Int’l Conf. Machine Learn. , 2010a.
[21] M. Kolar, L. Song, A. Ahmed, and E. P. Xing. Estimating Time-Varying networks., Ann. Appl. Statist , 4(1):94-123, 2010b. · Zbl 1189.62142
[22] S. L. Lauritzen., Graphical Models (Oxford Statistical Science Series) . Oxford University Press, USA, 1996.
[23] H. Li and J. Gui. Gradient directed regularization for sparse Gaussian concentration graphs, with applications to inference of genetic networks., Biostatistics , 7(2):302, 2006. · Zbl 1169.62378
[24] J. Liu, S. Wu, and J. V Zidek. On segmented multivariate regression., Stat. Sin. , 7:497-526, 1997. · Zbl 1003.62524
[25] K. Lounici, M. Pontil, A. B. Tsybakov, and S. van de Geer. Taking advantage of sparsity in Multi-Task learning. In, Proc. Conf. Learning Theory , 2009.
[26] J. Mairal, F. Bach, J. Ponce, and G. Sapiro. Online learning for matrix factorization and sparse coding., The Journal of Machine Learning Research , 11:19-60, 2010. · Zbl 1242.62087
[27] E. Mammen and S. van de Geer. Locally adaptive regression splines., Ann. Statist. , 25(1):387-413, 1997. · Zbl 0871.62040
[28] N. Meinshausen and P. Bühlmann. High-dimensional graphs and variable selection with the lasso., Ann. Stat. , 34(3) :1436-1462, 2006. · Zbl 1113.62082
[29] Y. Nesterov. Smooth minimization of non-smooth functions., Math. Program. , 103(1):127-152, 2005. · Zbl 1079.90102
[30] Y. Nesterov. Gradient methods for minimizing composite objective function. Technical Report 76 :2007, Center for Operations Research and Econometrics (CORE), Catholic University of Louvain, 2007.
[31] J. Peng, P. Wang, N. Zhou, and J. Zhu. Partial correlation estimation by joint sparse regression models., J. Am. Stat. Ass. , 104(486):735-746, 2009. · Zbl 1388.62046
[32] P. Ravikumar, M. J. Wainwright, G. Raskutti, and B. Yu. High-dimensional covariance estimation by minimizing \(\ell_1\)-penalized log-determinant divergence. Technical report, Department of Statistics, University of California, Berkeley, 2008. · Zbl 1274.62190
[33] P. Ravikumar, M. J. Wainwright, and J. D. Lafferty. High-dimensional ising model selection using \(\ell_1\) regularized logistic regression., Ann. Stat. , 38(3) :1287-1319, 2010. · Zbl 1189.62115
[34] A. Rinaldo. Properties and refinements of the fused lasso., Ann. Stat. , 37(5) :2922-2952, 2009. · Zbl 1173.62027
[35] A. J. Rothman, P. J. Bickel, E. Levina, and J. Zhu. Sparse permutation invariant covariance estimation., Electron. J. Stat. , 2:494-515, 2008. · Zbl 1320.62135
[36] R. Tibshirani, M. Saunders, S. Rosset, J. Zhu, and K. Knight. Sparsity and smoothness via the fused lasso., J. R. Stat. Soc. B , 67(1):91-108, 2005. · Zbl 1060.62049
[37] S. van de Geer and P. Bühlmann. On the conditions used to prove oracle results for the lasso., Electron. J. Stat. , 3 :1360-1392, 2009. · Zbl 1327.62425
[38] M. J. Wainwright. Sharp thresholds for high-dimensional and noisy sparsity recovery using \(\ell_1\) -constrained quadratic programming (lasso)., IEEE T. Inform. Theory , 55(5) :2183-2202, 2009. · Zbl 1367.62220
[39] M. J. Wainwright and M. I. Jordan. Graphical models, exponential families, and variational inference., Found. Trends Mach. Learn. , 1(1-2):1-305, 2008. · Zbl 1193.62107
[40] P. Wang, D. L. Chao, and L. Hsu. Learning networks from high dimensional binary data: An application to genomic instability data., Biometrics, to appear , 2009. · Zbl 1216.62180
[41] J. Yin, Z. Geng, R. Li, and H. Wang. Nonparametric Covariance Model., Statistica Sinica , 20:469-479, 2010. · Zbl 1180.62065
[42] M. Yuan and Y. Lin. Model selection and estimation in the gaussian graphical model., Biometrika , 94(1):19-35, 2007. · Zbl 1142.62408
[43] P. Zhao and B. Yu. On model selection consistency of lasso., J. Mach. Learn. Res. , 7 :2541-2563, 2006. · Zbl 1222.62008
[44] S. Zhou, J. Lafferty, and L. Wasserman. Time varying undirected graphs. In, Proc. Conf. Learning Theory , pages 455-466, 2008.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.