New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models. (English) Zbl 1209.62074

Summary: The complexity of semiparametric models poses new challenges to statistical inference and model selection that frequently arise from real applications. We propose new estimation and variable selection procedures for the semiparametric varying-coefficient partially linear model. We first study quantile regression estimates for the nonparametric varying-coefficient functions and the parametric regression coefficients. To achieve nice efficiency properties, we further develop a semiparametric composite quantile regression procedure. We establish the asymptotic normality of the proposed estimators for both the parametric and nonparametric parts and show that the estimators achieve the best convergence rate. Moreover, we show that the proposed method is much more efficient than the least-squares-based method for many non-normal errors and that it only loses a small amount of efficiency for normal errors. In addition, it is shown that the loss in efficiency is at most 11.1% for estimating varying coefficient functions and is no greater than 13.6% for estimating parametric components. To achieve sparsity with high-dimensional covariates, we propose adaptive penalization methods for variable selection in the semiparametric varying-coefficient partially linear model and prove that the methods possess the oracle property. Extensive Monte Carlo simulation studies are conducted to examine the finite-sample performance of the proposed procedures. Finally, we apply the new methods to analyze the plasma beta-carotene level data.
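For readers unfamiliar with composite quantile regression, the following is a minimal sketch of the objective in its simplest, purely linear form; it is not the semiparametric procedure developed in the paper, which additionally involves local smoothing of the varying-coefficient functions. The function names `check_loss` and `composite_quantile_loss`, and the choice of equally spaced quantile levels k/(K+1), are assumptions made for illustration only.

```python
import numpy as np


def check_loss(residual, tau):
    """Quantile check loss: rho_tau(r) = r * (tau - 1{r < 0})."""
    residual = np.asarray(residual, dtype=float)
    return residual * (tau - (residual < 0))


def composite_quantile_loss(y, X, beta, intercepts):
    """Sum of check losses over K quantile levels sharing one slope vector.

    Illustrative assumption: levels are tau_k = k / (K + 1), k = 1..K,
    each with its own intercept but a common beta.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    beta = np.asarray(beta, dtype=float)
    K = len(intercepts)
    taus = np.arange(1, K + 1) / (K + 1)
    fitted = X @ beta
    total = 0.0
    for tau, b in zip(taus, intercepts):
        # Residuals at this quantile level use the level-specific intercept.
        total += check_loss(y - b - fitted, tau).sum()
    return float(total)
```

With a single level (K = 1, so tau = 0.5) the objective reduces to half the sum of absolute residuals, i.e. least absolute deviation regression; using several levels at once is what lets the composite estimator borrow strength across quantiles.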

MSC:

62G08 Nonparametric regression and quantile regression
62G20 Asymptotic properties of nonparametric inference
62H12 Estimation in multivariate analysis
62F12 Asymptotic properties of parametric estimators
65C05 Monte Carlo methods
62G05 Nonparametric estimation

Software:

SemiPar
