Weak convergence of \(k\)-NN density and regression estimators with varying \(k\) and applications. (English) Zbl 0643.62027
Let \((X_i,Z_i)\), \(i\geq 1\), be independent, two-dimensional random vectors distributed as \((X,Z)\), where \(X\) has marginal distribution function \(F\) with density function \(f\) and where \(\mu(x)=E(Z\mid X=x)\) is the regression function of \(Z\) at \(X=x\). For fixed \(x\), set \(Y_i=|X_i-x|\), let \(Y_{n1}\leq\dots\leq Y_{nn}\) denote the order statistics of \(Y_1,\dots,Y_n\), and let \(Z_{n1},\dots,Z_{nn}\) be the induced order statistics in \((Y_1,Z_1),\dots,(Y_n,Z_n)\), i.e., \(Z_{ni}=Z_j\) if \(Y_{ni}=Y_j\).
The \(k\)-nearest neighbor (\(k\)-NN) estimator of \(f(x)\) corresponding to the uniform kernel, i.e., \(f_{nk}(x)=(k-1)/(2nY_{nk})\), and the \(k\)-NN estimator of \(\mu(x)\) with uniform weights, i.e., \(\mu_{nk}(x)=k^{-1}\sum_{j=1}^{k}Z_{nj}\), for fixed \(x\) and \(k\) varying in an appropriate range, are transformed into continuous-time stochastic processes by setting \[ T_n(t)=f_{n,[n^{4/5}t]}(x),\quad S_n(t)=\mu_{n,[n^{4/5}t]}(x),\quad 0<a\leq t\leq b<\infty. \] Under the usual second-order smoothness conditions, it is shown that the two processes \[ \{n^{2/5}[T_n(t)-f(x)],\quad a\leq t\leq b\},\quad \{n^{2/5}[S_n(t)-\mu(x)],\quad a\leq t\leq b\} \] have a common limiting structure as the sample size \(n\) tends to infinity. These results lead to asymptotic linear models in which BLUEs and suitably biased linear combinations of \(k\)-NN estimators with varying \(k\) are considered.
Reviewer: E. Häusler
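
The estimators defined in the review can be illustrated with a minimal Python sketch. It only evaluates \(f_{nk}(x)\), \(\mu_{nk}(x)\) and the processes \(T_n(t)\), \(S_n(t)\) on a grid of \(t\) values; it is not taken from the paper and does not reproduce its limit theory. The function names `knn_estimates` and `knn_process` are hypothetical, and the sketch assumes \([\cdot]\) denotes the integer part and that no sample point coincides with \(x\).

```python
def knn_estimates(xs, zs, x, k):
    """Return (f_nk(x), mu_nk(x)) for the uniform-kernel k-NN estimators.

    Hypothetical helper: xs, zs are equal-length sequences of observations.
    """
    n = len(xs)
    if len(zs) != n:
        raise ValueError("xs and zs must have the same length")
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    # Y_i = |X_i - x|; sorting the pairs by Y gives the order statistics
    # Y_n1 <= ... <= Y_nn together with the induced order statistics Z_ni.
    pairs = sorted(((abs(xi - x), zi) for xi, zi in zip(xs, zs)),
                   key=lambda p: p[0])
    y_nk = pairs[k - 1][0]
    if y_nk == 0:
        raise ValueError("k-th nearest distance is zero; f_nk is undefined")
    f_nk = (k - 1) / (2 * n * y_nk)          # f_nk(x) = (k-1) / (2 n Y_nk)
    mu_nk = sum(z for _, z in pairs[:k]) / k  # mu_nk(x) = k^{-1} sum Z_nj
    return f_nk, mu_nk


def knn_process(xs, zs, x, ts):
    """Evaluate T_n(t) and S_n(t) with k = [n^{4/5} t] for each t in ts.

    Returns a list of tuples (t, k, T_n(t), S_n(t)).
    Assumption: [.] is the integer part, computed here in floating point.
    """
    n = len(xs)
    out = []
    for t in ts:
        k = int(n ** 0.8 * t)
        f_nk, mu_nk = knn_estimates(xs, zs, x, k)
        out.append((t, k, f_nk, mu_nk))
    return out
```

For example, `knn_process(xs, zs, x, [0.5, 1.0, 1.5])` evaluates both estimators at three neighbor counts of order \(n^{4/5}\). Each \(t\) must be large enough that \([n^{4/5}t]\geq 2\) and small enough that it does not exceed \(n\).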

62G05 Nonparametric estimation
62J02 General nonlinear regression
60F17 Functional limit theorems; invariance principles
62G30 Order statistics; empirical distribution functions