Nonparametric screening under conditional strictly convex loss for ultrahigh dimensional sparse data. (English) Zbl 1421.62044

Summary: Sure screening is regarded as a powerful tool for ultrahigh dimensional variable selection problems, where the dimensionality \(p\) and the sample size \(n\) can satisfy the NP dimensionality \(\log p=O(n^{a})\) for some \(a>0\) [J. Fan and J. Lv, J. R. Stat. Soc., Ser. B, Stat. Methodol. 70, No. 5, 849–911 (2008; Zbl 1411.62187)]. The current paper aims to tackle the “universality” and the “effectiveness” of sure screening procedures simultaneously. For “universality”, we develop a general and unified framework for nonparametric screening methods from a loss function perspective, in which a loss function measures the divergence between the response variable and the underlying nonparametric function of the covariates. We propose a new class of loss functions, called conditional strictly convex loss, which contains, but is not limited to, the negative log-likelihood loss from one-parameter exponential families, the exponential loss for binary classification and the quantile regression loss. The sure screening property and control of the selected model size are established within this class of loss functions. For “effectiveness”, we focus on a goodness-of-fit nonparametric screening (Goffins) method under conditional strictly convex loss. Interestingly, we achieve a better convergence probability of containing the true model than that in the related literature. The superior performance of the proposed method is further demonstrated by extensive simulation studies and a real scientific data example.
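As a rough illustration of the ideas named in the summary, the following Python sketch writes down three loss functions of the kinds mentioned (a Poisson negative log-likelihood, the exponential loss and the quantile check loss) and ranks covariates by how much a marginal nonparametric fit reduces an empirical loss. It is not the Goffins procedure of the paper: the polynomial basis, the use of squared loss for the fit, and all function names (`marginal_utilities`, `screen`, and so on) are assumptions made only for this example.

```python
import numpy as np

def poisson_nll(y, f):
    """Poisson negative log-likelihood in the canonical parameter f (constant in y dropped)."""
    return np.exp(f) - y * f

def exponential_loss(y, f):
    """Exponential loss for binary labels y in {-1, +1}."""
    return np.exp(-y * f)

def check_loss(y, f, tau=0.5):
    """Quantile regression (check) loss at level tau."""
    r = y - f
    return r * (tau - (r < 0))

def marginal_utilities(X, y, degree=3):
    """Drop in mean squared loss when y is fitted on each covariate alone.

    Assumption: a polynomial basis stands in for a spline basis, and squared
    loss stands in for a general convex loss, to keep the sketch short.
    """
    n, p = X.shape
    base = np.mean((y - y.mean()) ** 2)          # loss of the intercept-only fit
    util = np.empty(p)
    for j in range(p):
        B = np.vander(X[:, j], degree + 1)       # columns x^degree, ..., x, 1
        coef, *_ = np.linalg.lstsq(B, y, rcond=None)
        util[j] = base - np.mean((y - B @ coef) ** 2)
    return util

def screen(X, y, keep, degree=3):
    """Indices of the `keep` covariates with the largest marginal utility."""
    util = marginal_utilities(X, y, degree)
    return np.argsort(util)[::-1][:keep]

# Toy data: only covariates 0 and 1 influence the response, both nonlinearly.
rng = np.random.default_rng(0)
n, p = 400, 500
X = rng.uniform(-1, 1, size=(n, p))
y = 2 * np.sin(np.pi * X[:, 0]) + 4 * X[:, 1] ** 2 + 0.5 * rng.standard_normal(n)

selected = screen(X, y, keep=20)
print(sorted(selected.tolist()))
```

Replacing the least-squares fit by a numerical minimiser of one of the other losses would move the sketch closer to the general loss-function perspective described above, at the cost of an optimisation step per covariate.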


62G09 Nonparametric statistical resampling methods
62P10 Applications of statistics to biology and medical sciences; meta analysis






[1] Anderson, M. J. and Robinson, J. (2001). Permutation tests for linear models. Aust. N. Z. J. Stat. 43 75-88. · Zbl 0992.62043 · doi:10.1111/1467-842X.00156
[2] Barut, E., Fan, J. and Verhasselt, A. (2016). Conditional sure independence screening. J. Amer. Statist. Assoc. 111 1266-1277.
[3] Brègman, L. M. (1967). A relaxation method of finding a common point of convex sets and its application to the solution of problems in convex programming. Ž. Vyčisl. Mat. Mat. Fiz. 7 620-631. · Zbl 0186.23807
[4] Buldygin, V. and Kozachenko, Y. (2000). Metric Characterization of Random Variables and Random Processes. Translations of Mathematical Monographs 188.
[5] Candès, E. and Tao, T. (2007). The Dantzig selector: Statistical estimation when \(p\) is much larger than \(n\). Ann. Statist. 35 2313-2404. · Zbl 1139.62019
[6] Chang, J., Tang, C. Y. and Wu, Y. (2013). Marginal empirical likelihood and sure independence feature screening. Ann. Statist. 41 2123-2148. · Zbl 1277.62109 · doi:10.1214/13-AOS1139
[7] de Boor, C. (1978). A Practical Guide to Splines. Applied Mathematical Sciences 27. Springer, New York. · Zbl 0406.41003
[8] Fan, J. and Fan, Y. (2008). High-dimensional classification using features annealed independence rules. Ann. Statist. 36 2605-2637. · Zbl 1360.62327 · doi:10.1214/07-AOS504
[9] Fan, J., Feng, Y. and Song, R. (2011). Nonparametric independence screening in sparse ultra-high-dimensional additive models. J. Amer. Statist. Assoc. 106 544-557. · Zbl 1232.62064 · doi:10.1198/jasa.2011.tm09779
[10] Fan, J., Feng, Y. and Tong, X. (2012). A road to classification in high dimensional space: The regularized optimal affine discriminant. J. R. Stat. Soc. Ser. B. Stat. Methodol. 74 745-771. · Zbl 1411.62167
[11] Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96 1348-1360. · Zbl 1073.62547 · doi:10.1198/016214501753382273
[12] Fan, J. and Lv, J. (2008). Sure independence screening for ultrahigh dimensional feature space. J. R. Stat. Soc. Ser. B. Stat. Methodol. 70 849-911. · Zbl 1411.62187
[13] Fan, J., Ma, Y. and Dai, W. (2014). Nonparametric independence screening in sparse ultra-high-dimensional varying coefficient models. J. Amer. Statist. Assoc. 109 1270-1284. · Zbl 1368.62095 · doi:10.1080/01621459.2013.879828
[14] Fan, J., Samworth, R. and Wu, Y. (2009). Ultrahigh dimensional feature selection: Beyond the linear model. J. Mach. Learn. Res. 10 2013-2038. · Zbl 1235.62089
[15] Fan, J. and Song, R. (2010). Sure independence screening in generalized linear models with NP-dimensionality. Ann. Statist. 38 3567-3604. · Zbl 1206.68157 · doi:10.1214/10-AOS798
[16] Freund, Y. and Schapire, R. E. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. System Sci. 55 119-139. · Zbl 0880.68103 · doi:10.1006/jcss.1997.1504
[17] Gao, Q., Wu, Y., Zhu, C. and Wang, Z. (2008). Asymptotic normality of maximum quasi-likelihood estimators in generalized linear models with fixed design. J. Syst. Sci. Complex. 21 463-473. · Zbl 1206.62130 · doi:10.1007/s11424-008-9128-4
[18] Gordon, G. et al. (2002). Translation of Microarray Data into Clinically Relevant Cancer Diagnostic Tests Using Gene Expression Ratios in Lung Cancer and Mesothelioma. Cancer Research 62 4963-4967.
[19] Han, X. (2019). Supplement to “Nonparametric screening under conditional strictly convex loss for ultrahigh dimensional sparse data.” DOI:10.1214/18-AOS1738SUPP. · Zbl 1421.62044
[20] He, X., Wang, L. and Hong, H. G. (2013). Quantile-adaptive model-free variable screening for high-dimensional heterogeneous data. Ann. Statist. 41 342-369. · Zbl 1295.62053 · doi:10.1214/13-AOS1087
[21] Heyde, C. C. (1997). Quasi-Likelihood and Its Application: A General Approach to Optimal Parameter Estimation. Springer, New York. · Zbl 0879.62076
[22] Koenker, R. (2005). Quantile Regression. Econometric Society Monographs 38. Cambridge Univ. Press, Cambridge. · Zbl 1111.62037
[23] Laurent, B. and Massart, P. (2000). Adaptive estimation of a quadratic functional by model selection. Ann. Statist. 28 1302-1338. · Zbl 1105.62328 · doi:10.1214/aos/1015957395
[24] Li, R., Zhong, W. and Zhu, L. (2012). Feature screening via distance correlation learning. J. Amer. Statist. Assoc. 107 1129-1139. · Zbl 1443.62184 · doi:10.1080/01621459.2012.695654
[25] Li, G., Peng, H., Zhang, J. and Zhu, L. (2012). Robust rank correlation based screening. Ann. Statist. 40 1846-1877. · Zbl 1257.62067 · doi:10.1214/12-AOS1024
[26] Mai, Q. and Zou, H. (2015). The fused Kolmogorov filter: A nonparametric model-free screening method. Ann. Statist. 43 1471-1497. · Zbl 1431.62216 · doi:10.1214/14-AOS1303
[27] Meier, L., van de Geer, S. and Bühlmann, P. (2009). High-dimensional additive modeling. Ann. Statist. 37 3779-3821. · Zbl 1360.62186 · doi:10.1214/09-AOS692
[28] Song, R., Lu, W., Ma, S. and Jeng, X. J. (2014). Censored rank independence screening for high-dimensional survival data. Biometrika 101 799-814. · Zbl 1306.62207 · doi:10.1093/biomet/asu047
[29] Stone, C. J. (1986). The dimensionality reduction principle for generalized additive models. Ann. Statist. 14 590-606. · Zbl 0603.62050 · doi:10.1214/aos/1176349940
[30] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. Roy. Statist. Soc. Ser. B 58 267-288. · Zbl 0850.62538 · doi:10.1111/j.2517-6161.1996.tb02080.x
[31] Weng, H., Feng, Y. and Qiao, X. (2017). Regularization after retention in ultrahigh dimensional linear regression models. Statist. Sinica. In press. · Zbl 1412.62098
[32] Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concave penalty. Ann. Statist. 38 894-942. · Zbl 1183.62120 · doi:10.1214/09-AOS729
[33] Zhang, C., Jiang, Y. and Shang, Z. (2009). New aspects of Bregman divergence in regression and classification with parametric and nonparametric estimation. Canad. J. Statist. 37 119-139. · Zbl 1170.62037 · doi:10.1002/cjs.10005
[34] Zhao, S. D. and Li, Y. (2012). Principled sure independence screening for Cox models with ultra-high-dimensional covariates. J. Multivariate Anal. 105 397-411. · Zbl 1233.62173 · doi:10.1016/j.jmva.2011.08.002
[35] Zhu, L.-P., Li, L., Li, R. and Zhu, L.-X. (2011). Model-free feature screening for ultrahigh-dimensional data. J. Amer. Statist. Assoc. 106 1464-1475. · Zbl 1233.62195 · doi:10.1198/jasa.2011.tm10563
[36] Zou, H. (2006). The adaptive lasso and its oracle properties. J. Amer. Statist. Assoc. 101 1418-1429. · Zbl 1171.62326 · doi:10.1198/016214506000000735
[37] Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B. Stat. Methodol. 67 301-320. · Zbl 1069.62054 · doi:10.1111/j.1467-9868.2005.00503.x
[38] Zou, H. and Li, R. (2008). One-step sparse estimates in nonconcave penalized likelihood models. Ann. Statist. 36 1509-1533. · Zbl 1142.62027 · doi:10.1214/009053607000000802
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases the data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.