zbMATH — the first resource for mathematics

More efficient approximation of smoothing splines via space-filling basis selection. (English) Zbl 07263823
Summary: We consider the problem of approximating smoothing spline estimators in a nonparametric regression model. When applied to a sample of size \(n\), the smoothing spline estimator can be expressed as a linear combination of \(n\) basis functions, requiring \(O(n^3)\) computational time when the number \(d\) of predictors is two or more. Such a sizeable computational cost hinders the broad applicability of smoothing splines. In practice, the full-sample smoothing spline estimator can be approximated by an estimator based on \(q\) randomly selected basis functions, resulting in a computational cost of \(O(nq^2)\). It is known that these two estimators converge at the same rate when \(q\) is of order \(O\{n^{2/(pr+1)}\}\), where \(p \in [1,2]\) depends on the true function and \(r > 1\) depends on the type of spline. Such a \(q\) is called the essential number of basis functions. In this article, we develop a more efficient basis selection method. By selecting basis functions corresponding to approximately equally spaced observations, the proposed method chooses a set of basis functions with great diversity. The asymptotic analysis shows that the proposed smoothing spline estimator can decrease \(q\) to around \(O\{n^{1/(pr+1)}\}\) when \(d \leq pr + 1\). Applications to synthetic and real-world datasets show that the proposed method leads to a smaller prediction error than other basis selection methods.
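
Illustration (not part of the published summary): the selection step described above, choosing basis functions at approximately equally spaced observations rather than at random, might be sketched as follows. The summary does not say how the spacing is achieved, so everything concrete here is an assumption: the predictors are taken to be rescaled to \([0,1]^d\) with \(d \leq 6\), a Halton sequence stands in for the space-filling design, and each design point is matched to its nearest unused observation. The names `radical_inverse`, `halton` and `select_basis_indices` are hypothetical.

```python
import random


def radical_inverse(i, base):
    """Van der Corput radical inverse of the positive integer i in the given base."""
    inv, f = 0.0, 1.0 / base
    while i > 0:
        inv += (i % base) * f
        i //= base
        f /= base
    return inv


def halton(q, d):
    """First q points of a d-dimensional Halton sequence (assumed design, d <= 6)."""
    primes = [2, 3, 5, 7, 11, 13]
    if d > len(primes):
        raise ValueError("this sketch supports at most 6 predictors")
    return [[radical_inverse(i, b) for b in primes[:d]] for i in range(1, q + 1)]


def select_basis_indices(x, q):
    """Return q distinct observation indices, each nearest to one design point.

    x is a list of n points in [0, 1]^d (assumed already rescaled), q <= n.
    """
    n, d = len(x), len(x[0])
    if q > n:
        raise ValueError("q must not exceed the sample size")
    taken, chosen = set(), []
    for pt in halton(q, d):
        # Brute-force nearest neighbour among observations not yet selected.
        best = min(
            (j for j in range(n) if j not in taken),
            key=lambda j: sum((x[j][k] - pt[k]) ** 2 for k in range(d)),
        )
        taken.add(best)
        chosen.append(best)
    return chosen


# Usage: 500 synthetic observations in two dimensions, 20 basis functions.
rng = random.Random(0)
x = [[rng.random(), rng.random()] for _ in range(500)]
idx = select_basis_indices(x, 20)
```

The indices in `idx` would then identify the \(q\) basis functions used in place of a uniformly random subset. The brute-force search costs \(O(nq)\) distance evaluations, which is small next to the \(O(nq^2)\) fitting cost quoted in the summary.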
MSC:
62G08 Nonparametric regression and quantile regression
62K15 Factorial statistical designs
65D07 Numerical computation using splines
62P12 Applications of statistics to environmental and related topics