Composite quantile regression for ultra-high dimensional semiparametric model averaging. (English) Zbl 07422733

Summary: To estimate the joint multivariate regression function, a robust ultra-high dimensional semiparametric model averaging approach is developed. Specifically, a three-step estimation procedure is proposed. In the first step, the joint multivariate regression function is approximated by a weighted average of one-dimensional marginal regression functions, each of which can be estimated robustly by composite quantile marginal regression. In the second step, a nonparametric composite quantile correlation screening technique is proposed to robustly select the relatively important regressors, that is, those whose marginal regression functions have significant effects on estimating the joint regression function. In the third step, based on the regressors that survive the screening procedure, a penalized composite quantile model averaging marginal regression is used to obtain sparse model weights and to estimate the joint regression function. The sure independence screening property of the proposed screening procedure and the sparsity property of the penalized estimator are established under some regularity conditions. Extensive simulation studies and an empirical application are used to verify the merits of the proposed approach.
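
The third step can be illustrated with a minimal sketch of a penalized composite quantile objective for model averaging weights. Everything below is an assumption made for illustration and is not taken from the paper: the function names are hypothetical, the quantile levels are taken as the equally spaced grid k/(K+1), the marginal fits are assumed to be already estimated and stored column-wise, and a plain L1 penalty stands in for whatever penalty the authors actually use.

```python
import numpy as np


def quantile_levels(K):
    """Equally spaced quantile levels k/(K+1), k = 1..K (assumed grid)."""
    return np.arange(1, K + 1) / (K + 1)


def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))


def composite_quantile_loss(residuals, intercepts, taus):
    """Sum of check losses over observations and quantile levels.

    residuals : shape (n,), y minus the fitted regression function
    intercepts: shape (K,), one intercept b_k per quantile level
    taus      : shape (K,), quantile levels
    """
    r = residuals[:, None] - intercepts[None, :]
    return float(np.sum(check_loss(r, taus[None, :])))


def penalized_objective(y, marginal_fits, weights, intercepts, taus, lam):
    """Composite quantile loss of the weighted average plus an L1 penalty.

    marginal_fits: shape (n, q), column j holds the fitted marginal
                   regression function of the j-th retained regressor
    weights      : shape (q,), model averaging weights
    lam          : penalty level (the L1 penalty is a placeholder)
    """
    residuals = y - marginal_fits @ weights
    loss = composite_quantile_loss(residuals, intercepts, taus)
    return loss + lam * float(np.sum(np.abs(weights)))
```

In practice this objective would be minimized jointly over the weights and intercepts, for example by linear programming; the sketch only evaluates it, which is enough to show how sparsity in the weights trades off against the composite quantile fit.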

MSC:

62-XX Statistics
