## Automatic component selection in additive modeling of French national electricity load forecasting. (English) Zbl 1366.62055

Cao, Ricardo (ed.) et al., Nonparametric statistics. 2nd ISNPS, Cádiz, June 2014. Selected papers based on the presentations at the second conference of the International Society for Nonparametric Statistics, ISNPS, Cádiz, Spain, June 12–16, 2014. Cham: Springer (ISBN 978-3-319-41581-9/hbk; 978-3-319-41582-6/ebook). Springer Proceedings in Mathematics & Statistics 175, 191–209 (2016).
Summary: We consider estimation and model selection in sparse high-dimensional linear additive models when multiple covariates need to be modeled nonparametrically, and propose some multi-step estimators based on $$B$$-spline approximations of the additive components. In such models, the overall number of regressors $$d$$ can be large, possibly much larger than the sample size $$n$$. However, we assume that the number of regressors that capture most of the impact of all covariates on the response variable is smaller than $$n$$. Our estimation and model selection results are valid without assuming the conventional “separation condition” – namely, without assuming that the norm of each of the true nonzero components is bounded away from zero. Instead, we relax this assumption by allowing the norms of nonzero components to converge to zero at a certain rate. The approaches investigated in this paper consist of two steps. The first step performs the variable selection, typically by the Group Lasso, and the second step applies a penalized $$P$$-spline estimation to the selected additive components. Regarding the model selection task, we discuss the application of several criteria, such as the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and Generalized Cross Validation (GCV), and study the consistency of BIC, i.e. its ability to select the true model with probability converging to 1. We then study the post-model estimation consistency of the selected components. We end the paper by applying the proposed procedure to real data related to electricity load consumption forecasting: the EDF (Électricité de France) portfolio.
For the entire collection see [Zbl 1353.62010].
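
The two-step scheme described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`bspline_basis`, `group_lasso`, `pspline_refit`, `two_step_fit`), the uniform-knot cubic basis, the proximal-gradient solver, the synthetic data, and the fixed tuning values `lam` and `lam2` are all assumptions made for illustration. In the paper the tuning parameters would be chosen by a criterion such as BIC, AIC, or GCV rather than fixed by hand.

```python
import numpy as np

def bspline_basis(x, n_seg=8, degree=3, lo=0.0, hi=1.0):
    """Uniform-knot B-spline basis on [lo, hi) via the Cox-de Boor recursion."""
    h = (hi - lo) / n_seg
    knots = lo + h * np.arange(-degree, n_seg + degree + 1)
    xc = x[:, None]
    # Degree-0 splines are indicators of the knot intervals.
    B = ((xc >= knots[None, :-1]) & (xc < knots[None, 1:])).astype(float)
    for k in range(1, degree + 1):
        left = (xc - knots[None, :-k - 1]) / (knots[k:-1] - knots[:-k - 1])
        right = (knots[None, k + 1:] - xc) / (knots[k + 1:] - knots[1:-k])
        B = left * B[:, :-1] + right * B[:, 1:]
    return B  # shape (len(x), n_seg + degree)

def group_lasso(Z_groups, y, lam, n_iter=500):
    """Step 1 (assumed solver): proximal gradient on
    (1/2n)||y - Z b||^2 + lam * sum_j sqrt(p_j) ||b_j||_2.
    Returns the indices of groups with a nonzero coefficient block."""
    n = len(y)
    Z = np.hstack(Z_groups)
    sizes = [G.shape[1] for G in Z_groups]
    edges = np.cumsum([0] + sizes)
    step = n / np.linalg.norm(Z, 2) ** 2  # 1 / Lipschitz constant of the gradient
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        v = beta + step * Z.T @ (y - Z @ beta) / n
        for j, p in enumerate(sizes):
            s = slice(edges[j], edges[j + 1])
            norm = np.linalg.norm(v[s])
            # Block soft-thresholding: shrink the whole group or set it to zero.
            shrink = max(0.0, 1.0 - step * lam * np.sqrt(p) / norm) if norm > 0 else 0.0
            beta[s] = shrink * v[s]
    return [j for j in range(len(sizes)) if np.any(beta[edges[j]:edges[j + 1]] != 0)]

def pspline_refit(Z_groups, y, selected, lam2=1.0):
    """Step 2: least squares on the selected components with a second-order
    difference penalty on each coefficient block (P-spline penalty)."""
    if not selected:
        return np.zeros_like(y)
    Z = np.hstack([Z_groups[j] for j in selected])
    rows, col = [], 0
    for j in selected:
        p = Z_groups[j].shape[1]
        Dj = np.zeros((p - 2, Z.shape[1]))
        Dj[:, col:col + p] = np.diff(np.eye(p), n=2, axis=0)
        rows.append(Dj)
        col += p
    D = np.vstack(rows)
    # Solve the penalized problem as one augmented least-squares system;
    # lstsq copes with the rank deficiency caused by centering the bases.
    A = np.vstack([Z, np.sqrt(lam2) * D])
    b = np.concatenate([y, np.zeros(D.shape[0])])
    theta, *_ = np.linalg.lstsq(A, b, rcond=None)
    return Z @ theta

def two_step_fit(X, y, lam, lam2=1.0):
    """Select components by Group Lasso, then refit them with P-splines."""
    y_c = y - y.mean()
    Z_groups = []
    for j in range(X.shape[1]):
        B = bspline_basis(X[:, j])
        Z_groups.append(B - B.mean(axis=0))  # center so the intercept is separate
    selected = group_lasso(Z_groups, y_c, lam)
    fitted = y.mean() + pspline_refit(Z_groups, y_c, selected, lam2)
    return selected, fitted

# Synthetic illustration (not the EDF data): 6 covariates, 2 of them active.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 6))
y = np.sin(2 * np.pi * X[:, 0]) + 4 * (X[:, 1] - 0.5) ** 2 + 0.1 * rng.normal(size=200)
selected, fitted = two_step_fit(X, y, lam=0.05)
```

Here `selected` holds the indices of the components kept by the first step, and `fitted` holds the in-sample predictions after the penalized refit. Refitting only the selected components keeps the selection penalty from also shrinking the final estimates, which is the motivation for separating the two steps.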

### MSC:

- 62P20 Applications of statistics to economics
- 62G05 Nonparametric estimation
- 62J07 Ridge regression; shrinkage estimators (Lasso)

### Software:

CRONE; ML; gamair; FIT; Ninteger; ora_foc; R; FOMCON; fderiv; DFOC; FSST; FOPID; Matlab
