Aggregation of predictors for nonstationary sub-linear processes and online adaptive forecasting of time varying autoregressive processes. (English) Zbl 1327.62478

Summary: In this work, we study the problem of aggregating a finite number of predictors for nonstationary sub-linear processes. We provide oracle inequalities relying essentially on three ingredients: (1) a uniform bound of the \(\ell^{1}\) norm of the time varying sub-linear coefficients, (2) a Lipschitz assumption on the predictors and (3) moment conditions on the noise appearing in the linear representation. Two kinds of aggregations are considered giving rise to different moment conditions on the noise and more or less sharp oracle inequalities. We apply this approach for deriving an adaptive predictor for locally stationary time varying autoregressive (TVAR) processes. It is obtained by aggregating a finite number of well chosen predictors, each of them enjoying an optimal minimax convergence rate under specific smoothness conditions on the TVAR coefficients. We show that the obtained aggregated predictor achieves a minimax rate while adapting to the unknown smoothness. To prove this result, a lower bound is established for the minimax rate of the prediction risk for the TVAR process. Numerical experiments complete this study. An important feature of this approach is that the aggregated predictor can be computed recursively and is thus applicable in an online prediction context.
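The recursive, online character of the aggregation described in the summary can be illustrated with a minimal sketch. The summary does not state the aggregation rule, so the code below assumes exponentially weighted averaging under squared loss, a standard scheme in the aggregation literature. The function name `aggregate_online`, the learning rate `eta` and the toy predictors are hypothetical and are not taken from the paper.

```python
import math


def _normalise(log_w):
    """Turn log-weights into weights summing to one (numerically stable)."""
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]


def aggregate_online(observations, predictors, eta=0.5):
    """Exponentially weighted aggregation of a finite set of predictors.

    observations: sequence of floats, revealed one at a time.
    predictors:   list of functions mapping the past (a list) to a forecast.
    eta:          learning rate (assumed value, not from the paper).
    Returns the aggregated forecasts and the final weights.
    """
    log_w = [0.0] * len(predictors)  # uniform prior on the predictors
    forecasts = []
    past = []
    for x in observations:
        preds = [p(past) for p in predictors]
        w = _normalise(log_w)
        # Aggregated forecast: convex combination of the individual forecasts.
        forecasts.append(sum(wi * pi for wi, pi in zip(w, preds)))
        # Recursive update: only the current squared losses are needed,
        # so the cost per time step does not grow with the sample size.
        log_w = [lw - eta * (pi - x) ** 2 for lw, pi in zip(log_w, preds)]
        past.append(x)
    return forecasts, _normalise(log_w)


if __name__ == "__main__":
    # Toy example: a "last value" predictor against a constant-zero predictor.
    last_value = lambda past: past[-1] if past else 0.0
    zero = lambda past: 0.0
    forecasts, weights = aggregate_online([1.0] * 20, [last_value, zero])
    print(weights)
```

On a constant sequence, the weight of the "last value" predictor should approach one, since it incurs no loss after the first step. In the setting of the summary, the individual predictors would instead be TVAR predictors tuned to different smoothness levels.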


62M20 Inference from stochastic processes and prediction
62G99 Nonparametric inference
62M10 Time series, auto-correlation, regression, etc. in statistics (GARCH)
68W27 Online algorithms; streaming algorithms


[1] Alquier, P. and Wintenberger, O. (2012). Model selection for weakly dependent time series forecasting. Bernoulli 18 883-913. · Zbl 1243.62117
[2] Anava, O., Hazan, E., Mannor, S. and Shamir, O. (2013). Online learning for time series prediction. Preprint. Available at arXiv:1302.6927.
[3] Arkoun, O. (2011). Sequential adaptive estimators in nonparametric autoregressive models. Sequential Anal. 30 229-247. · Zbl 1215.62079
[4] Audibert, J.-Y. (2009). Fast learning rates in statistical inference through aggregation. Ann. Statist. 37 1591-1646. · Zbl 1360.62167
[5] Brockwell, P. J. and Davis, R. A. (2006). Time Series: Theory and Methods. Springer, New York. Reprint of the second (1991) edition. · Zbl 0709.62080
[6] Catoni, O. (1997). A mixture approach to universal model selection. Technical report, École Normale Supérieure. · Zbl 0928.62033
[7] Catoni, O. (2004). Statistical Learning Theory and Stochastic Optimization. Lecture Notes in Math. 1851. Springer, Berlin. · Zbl 1076.93002
[8] Cesa-Bianchi, N. and Lugosi, G. (2006). Prediction, Learning, and Games. Cambridge Univ. Press, Cambridge. · Zbl 1114.91001
[9] Dahlhaus, R. (1996). On the Kullback-Leibler information divergence of locally stationary processes. Stochastic Process. Appl. 62 139-168. · Zbl 0849.60032
[10] Dahlhaus, R. (2009). Local inference for locally stationary time series based on the empirical spectral measure. J. Econometrics 151 101-112. · Zbl 1431.62362
[11] Dahlhaus, R. and Polonik, W. (2006). Nonparametric quasi-maximum likelihood estimation for Gaussian locally stationary processes. Ann. Statist. 34 2790-2824. · Zbl 1114.62034
[12] Dahlhaus, R. and Polonik, W. (2009). Empirical spectral processes for locally stationary time series. Bernoulli 15 1-39. · Zbl 1204.62156
[13] Dalalyan, A. S. and Tsybakov, A. B. (2008). Aggregation by exponential weighting, sharp PAC-Bayesian bounds and sparsity. Mach. Learn. 72 39-61.
[14] Doukhan, P. and Wintenberger, O. (2008). Weakly dependent chains with infinite memory. Stochastic Process. Appl. 118 1997-2013. · Zbl 1166.60031
[15] Gerchinovitz, S. (2011). Prediction of individual sequences and prediction in the statistical framework: Some links around sparse regression and aggregation techniques. Ph.D. thesis, Univ. Paris Sud-Paris XI.
[16] Giraud, C., Roueff, F. and Sanchez-Perez, A. (2015). Supplement to “Aggregation of predictors for non stationary sub-linear processes and online adaptive forecasting of time varying autoregressive processes.” · Zbl 1327.62478
[17] Grenier, Y. (1983). Time-dependent ARMA modeling of nonstationary signals. IEEE Transactions on ASSP 31 899-911.
[18] Juditsky, A. and Nemirovski, A. (2000). Functional aggregation for nonparametric regression. Ann. Statist. 28 681-712. · Zbl 1105.62338
[19] Künsch, H. R. (1995). A note on causal solutions for locally stationary AR-processes. Unpublished preprint, ETH Zürich.
[20] Lepskiĭ, O. V. (1990). A problem of adaptive estimation in Gaussian white noise. Teor. Veroyatn. Primen. 35 459-470. · Zbl 0725.62075
[21] Leung, G. and Barron, A. R. (2006). Information theory and mixing least-squares regressions. IEEE Trans. Inform. Theory 52 3396-3410. · Zbl 1309.94051
[22] Massart, P. (2007). Concentration Inequalities and Model Selection. Lecture Notes in Math. 1896. Springer, Berlin. · Zbl 1170.60006
[23] Moulines, E., Priouret, P. and Roueff, F. (2005). On recursive estimation for time varying autoregressive processes. Ann. Statist. 33 2610-2654. · Zbl 1084.62089
[24] Rigollet, P. and Tsybakov, A. B. (2012). Sparse estimation by exponential weighting. Statist. Sci. 27 558-575. · Zbl 1331.62351
[25] Sancetta, A. (2010). Recursive forecast combination for dependent heterogeneous data. Econometric Theory 26 598-631. · Zbl 1189.62186
[26] Stoltz, G. (2011). Contributions to the sequential prediction of arbitrary sequences: Applications to the theory of repeated games and empirical studies of the performance of the aggregation of experts. Habilitation à diriger des recherches, Univ. Paris Sud-Paris XI.
[27] Tong, H. and Lim, K. S. (1980). Threshold autoregression, limit cycles and cyclical data. J. Roy. Statist. Soc. Ser. B 42 245-292. · Zbl 0473.62081
[28] Tsybakov, A. B. (2003). Optimal rates of aggregation. In Learning Theory and Kernel Machines (B. Schölkopf and M. K. Warmuth, eds.). Lecture Notes in Computer Science 2777 303-313. Springer, Berlin. · Zbl 1208.62073
[29] Tsybakov, A. B. (2009). Introduction to Nonparametric Estimation. Springer, New York. · Zbl 1176.62032
[30] Vovk, V. G. (1990). Aggregating strategies. In Proc. Third Workshop on Computational Learning Theory 371-383. Morgan Kaufmann, San Mateo, CA.
[31] Wang, Z., Paterlini, S., Gao, F. and Yang, Y. (2014). Adaptive minimax regression estimation over sparse \(\ell_{q}\)-hulls. J. Mach. Learn. Res. 15 1675-1711. · Zbl 1319.62016
[32] Yang, Y. (2000a). Combining different procedures for adaptive regression. J. Multivariate Anal. 74 135-161. · Zbl 0964.62032
[33] Yang, Y. (2000b). Mixing strategies for density estimation. Ann. Statist. 28 75-87. · Zbl 1106.62322
[34] Yang, Y. (2004). Combining forecasting procedures: Some theoretical results. Econometric Theory 20 176-222. · Zbl 1046.62101