Variable selection in high-dimensional partially linear additive models for composite quantile regression. (English) Zbl 1471.62081

Summary: A new estimation procedure based on composite quantile regression is proposed for semiparametric additive partial linear models, in which the nonparametric components are approximated by polynomial splines. The method estimates the parametric regression coefficients and the nonparametric components simultaneously, without requiring any specification of the error distribution. It is shown empirically to be much more efficient than the popular least-squares-based estimation method for non-normal random errors, especially Cauchy errors, and almost as efficient for normal random errors. To achieve sparsity in high-dimensional sparse additive partial linear models, in which the number of linear covariates is much larger than the sample size but the number of significant covariates is small relative to the sample size, a variable selection procedure based on the adaptive Lasso is proposed that performs estimation and variable selection simultaneously. The procedure is shown to possess the oracle property, and it is much superior to the adaptive-Lasso-penalized least-squares-based method regardless of the random error distribution. In particular, two kinds of weights in the penalty are considered, namely those based on the composite quantile regression estimates and those based on the Lasso-penalized composite quantile regression estimates. Both types of weights perform very well, with the latter performing especially well in terms of precisely selecting the significant variables. The simulation results are consistent with the theoretical properties. A real data example illustrates the application of the proposed methods.
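
To make the objective described in the summary concrete, the following is a minimal sketch of an adaptive-Lasso-penalized composite quantile regression criterion. It is not the authors' implementation. Several choices are assumptions that the summary does not state: the equally spaced quantile levels tau_k = k/(K+1), the reciprocal form of the adaptive weights, and all names (`check_loss`, `cqr_objective`, `adaptive_weights`, the spline design matrix `B`). The sketch only evaluates the criterion; it does not minimize it.

```python
import numpy as np


def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))


def cqr_objective(y, X, B, beta, gamma, intercepts, lam=0.0, weights=None):
    """Evaluate a penalized composite quantile regression criterion.

    y          : (n,) response
    X          : (n, p) linear covariates
    B          : (n, q) spline basis evaluated at the nonparametric
                 covariates (hypothetical input, built elsewhere)
    beta       : (p,) linear coefficients, shared across quantile levels
    gamma      : (q,) spline coefficients, shared across quantile levels
    intercepts : (K,) one intercept per quantile level
    lam        : penalty level; weights: (p,) adaptive Lasso weights
    """
    K = len(intercepts)
    # Assumed convention: equally spaced quantile levels k / (K + 1).
    taus = np.arange(1, K + 1) / (K + 1)
    fitted = X @ beta + B @ gamma
    # Sum the check loss over all quantile levels and observations.
    loss = sum(check_loss(y - b - fitted, tau).sum()
               for b, tau in zip(intercepts, taus))
    if weights is None:
        weights = np.ones_like(beta, dtype=float)
    # Weighted L1 penalty on the linear coefficients only.
    return loss + lam * np.sum(weights * np.abs(beta))


def adaptive_weights(beta_init, eps=1e-8):
    """Assumed weight form 1 / |initial estimate|; eps avoids division by zero."""
    return 1.0 / (np.abs(np.asarray(beta_init, dtype=float)) + eps)
```

Under these assumptions, the two weighting schemes mentioned in the summary differ only in which initial estimate is passed to `adaptive_weights`: an unpenalized composite quantile regression fit or a Lasso-penalized one.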

MSC:

62-08 Computational methods for problems pertaining to statistics
62G08 Nonparametric regression and quantile regression
