Nonconcave penalized composite conditional likelihood estimation of sparse Ising models. (English) Zbl 1284.62451

Summary: The Ising model is a useful tool for studying complex interactions within a system. Estimating such a model, however, is challenging, especially when the parameter is high-dimensional. In this work, we propose efficient procedures for learning a sparse Ising model based on a penalized composite conditional likelihood with nonconcave penalties. Nonconcave penalized likelihood estimation has received considerable attention in recent years, but it is computationally prohibitive under high-dimensional Ising models. To overcome this difficulty, we extend the methodology and theory of nonconcave penalized likelihood to penalized composite conditional likelihood estimation. The proposed method can be implemented efficiently by taking advantage of coordinate-ascent and minorization-maximization principles. Asymptotic oracle properties of the proposed method are established with NP-dimensionality. Optimality of the computed local solution is discussed. We demonstrate the method's finite-sample performance via simulation studies and further illustrate our proposal by studying the Human Immunodeficiency Virus type 1 protease structure based on data from the Stanford HIV drug resistance database. Our statistical learning results closely match the known biological findings, although no prior biological information is used in the data analysis procedure.
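
For orientation, the kind of objective the summary describes can be sketched as follows. The notation is an assumption made here for illustration and is not taken from this record: \(K\) binary variables coded as \(x_j \in \{0,1\}\), a symmetric coupling matrix \((\beta_{jk})\), \(n\) observations \(x_i = (x_{i1},\dots,x_{iK})\), and the SCAD penalty as one common example of a nonconcave penalty. The paper's own coding, scaling and choice of penalty may differ.

```latex
\begin{align*}
% Ising model with intractable normalizing constant Z(\beta)
P_\beta(x) &= \frac{1}{Z(\beta)}
  \exp\Big( \sum_{j=1}^{K} \beta_{jj} x_j + \sum_{j<k} \beta_{jk} x_j x_k \Big),
  \qquad x \in \{0,1\}^K, \\
% Conditional law of one coordinate: a logistic model, free of Z(\beta)
P_\beta(x_j \mid x_{-j}) &= \frac{\exp(x_j \eta_j)}{1 + \exp(\eta_j)},
  \qquad \eta_j = \beta_{jj} + \sum_{k \neq j} \beta_{jk} x_k, \\
% Composite conditional log-likelihood over n observations
\ell_c(\beta) &= \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{K}
  \log P_\beta(x_{ij} \mid x_{i,-j}), \\
% Penalized criterion; only the couplings are penalized in this sketch
Q(\beta) &= \ell_c(\beta) - \sum_{j<k} p_\lambda(|\beta_{jk}|), \\
% SCAD penalty, defined through its derivative (assumed example)
p_\lambda'(t) &= \lambda \Big\{ \mathbf{1}(t \le \lambda)
  + \frac{(a\lambda - t)_+}{(a-1)\lambda}\, \mathbf{1}(t > \lambda) \Big\},
  \qquad t \ge 0,\ a > 2.
\end{align*}
```

The point of the conditional form is that each factor is a logistic regression of one coordinate on the others, so the normalizing constant \(Z(\beta)\), a sum over \(2^K\) configurations, never has to be evaluated. Because \(p_\lambda\) is nonconcave, \(Q\) can have several local maximizers, which is why the summary speaks of the optimality of the computed local solution.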


62J07 Ridge regression; shrinkage estimators (Lasso)
62G20 Asymptotic properties of nonparametric inference
62P10 Applications of statistics to biology and medical sciences; meta analysis



