## On the choice of difference sequence in a unified framework for variance estimation in nonparametric regression. (English) Zbl 1442.62081

Summary: Difference-based methods do not require estimating the mean function in nonparametric regression and are therefore popular in practice. In this paper, we propose a unified framework for variance estimation that systematically combines the linear regression method with the higher-order difference estimators. The unified framework enriches the existing literature on variance estimation and includes most existing estimators as special cases. More importantly, it provides a smart way to solve the challenging difference sequence selection problem, which has been a controversial issue in nonparametric regression for several decades. Using both theory and simulations, we recommend using the ordinary difference sequence in the unified framework, regardless of whether the sample size is small or the signal-to-noise ratio is large. Finally, to meet the demands of applications, we have developed a unified R package, named VarED, that integrates the existing difference-based estimators and the unified estimators in nonparametric regression, and have made it freely available through the R statistical program at http://cran.r-project.org/web/packages/.
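As a rough illustration of what "difference-based" means here, the sketch below computes a plain difference-based variance estimate using the normalized ordinary difference sequence, whose coefficients are $(-1)^j \binom{r}{j} / \binom{2r}{r}^{1/2}$ for order $r$. This is a minimal sketch of the classical estimator only, not the unified estimator proposed in the paper and not the VarED interface. The function names are hypothetical, and the code assumes an equally spaced design with independent, constant-variance errors.

```python
import math
import random


def ordinary_difference_sequence(order):
    """Return the normalized ordinary difference sequence of a given order.

    Coefficients are (-1)^j * C(order, j) / sqrt(C(2 * order, order)), so
    they sum to zero and their squares sum to one.
    """
    scale = math.sqrt(math.comb(2 * order, order))
    return [(-1) ** j * math.comb(order, j) / scale for j in range(order + 1)]


def difference_variance_estimate(y, order=1):
    """Difference-based estimate of the residual variance (hypothetical helper).

    Assumes y is sorted by an equally spaced design point and that the errors
    are independent with constant variance. The mean function is never
    estimated; smooth trends are removed by differencing neighbours.
    """
    d = ordinary_difference_sequence(order)
    n = len(y)
    if n <= order:
        raise ValueError("need more observations than the difference order")
    total = 0.0
    for i in range(n - order):
        # Weighted difference of order + 1 consecutive responses.
        diff = sum(d[j] * y[i + j] for j in range(order + 1))
        total += diff * diff
    return total / (n - order)


if __name__ == "__main__":
    # Simulated data: smooth mean plus Gaussian noise with variance 0.25.
    rng = random.Random(0)
    n = 500
    y = [math.sin(2 * math.pi * i / n) + rng.gauss(0.0, 0.5) for i in range(n)]
    for r in (1, 2, 3):
        print(r, difference_variance_estimate(y, order=r))
```

With `order=1` this reduces to the average of squared successive differences divided by two. On the simulated data, each estimate should land near the true variance of 0.25, although the exact values depend on the random draw.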

### MSC:

62G08 Nonparametric regression and quantile regression

62G20 Asymptotic properties of nonparametric inference

### Software:

VarED; R
