Variable screening for high dimensional time series. (English) Zbl 06864473
Summary: Variable selection is a widely studied problem in high dimensional statistics, primarily because estimating the precise relationship between the covariates and the response is of great importance in many scientific disciplines. However, most of the theory and methods developed towards this goal for the linear model invoke the assumption of iid sub-Gaussian covariates and errors. This paper analyzes the theoretical properties of Sure Independence Screening (SIS) [J. Fan and J. Lv, “Sure independence screening for ultrahigh dimensional feature space”, J. R. Stat. Soc. Ser. B (Statistical Methodol.) 70, No. 5, 849–911 (2008; doi:10.1111/j.1467-9868.2008.00674.x)] for high dimensional linear models with dependent and/or heavy tailed covariates and errors. We also introduce a generalized least squares screening (GLSS) procedure which utilizes the serial correlation present in the data. By utilizing this serial correlation when estimating our marginal effects, GLSS is shown to outperform SIS in many cases. For both procedures we prove sure screening properties, which depend on the moment conditions and the strength of dependence in the error and covariate processes, amongst other factors. We also analyze the combination of these screening procedures with the adaptive Lasso. Dependence is quantified by functional dependence measures [W. B. Wu, Proc. Natl. Acad. Sci. USA 102, No. 40, 14150–14154 (2005; Zbl 1135.62075)], and the results rely on the use of Nagaev-type and exponential inequalities for dependent random variables. We also conduct simulations to demonstrate the finite sample performance of these procedures, and include a real data application of forecasting the US inflation rate.
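
For readers unfamiliar with SIS, the following is a minimal sketch of the basic screening step as described by Fan and Lv: rank the covariates by the absolute value of their marginal correlation with the response and keep the top d. The function name `sis_screen`, the simulated iid Gaussian design, and the submodel size d = ⌊n / log n⌋ are illustrative assumptions; the sketch does not reproduce the dependent or heavy tailed settings studied in the paper, and it does not implement GLSS.

```python
import numpy as np

def sis_screen(X, y, d):
    """Return the (sorted) indices of the d covariates whose marginal
    correlation with y is largest in absolute value.

    Hypothetical helper for illustration only; not the authors' code.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standardize each column so marginal regression coefficients
    # are comparable across covariates.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    # Componentwise marginal effects: omega_j = (1/n) * x_j' y.
    omega = Xs.T @ yc / len(y)
    # Keep the d covariates with the largest |omega_j|.
    order = np.argsort(-np.abs(omega))
    return np.sort(order[:d])

# Toy example (assumption: iid Gaussian covariates and errors).
rng = np.random.default_rng(0)
n, p = 200, 1000
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [3.0, -2.5, 2.0]          # three truly active covariates
y = X @ beta + rng.standard_normal(n)

d = int(n / np.log(n))               # illustrative submodel size
selected = sis_screen(X, y, d)
print(len(selected), selected[:10])
```

In a two-stage procedure of the kind the summary mentions, the retained columns `X[:, selected]` would then be passed to a penalized method such as the adaptive Lasso.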

62F07 Statistical ranking and selection procedures
62J07 Ridge regression; shrinkage estimators (Lasso)
