zbMATH — the first resource for mathematics

On the \(L_p\) norms of kernel regression estimators for incomplete data with applications to classification. (English) Zbl 1397.62151
Summary: We consider kernel methods to construct nonparametric estimators of a regression function based on incomplete data. To tackle the presence of incomplete covariates, we employ Horvitz-Thompson-type inverse weighting techniques, where the weights are the selection probabilities. The unknown selection probabilities are themselves estimated using (1) kernel regression, when the functional form of these probabilities is completely unknown, and (2) the least-squares method, when the selection probabilities belong to a known class of candidate functions. To assess the overall performance of the proposed estimators, we establish exponential upper bounds on the \(L_p\) norms, \(1\leq p<\infty \), of our estimators; these bounds immediately yield various strong convergence results. We also apply our results to deal with the important problem of statistical classification with partially observed covariates.
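The inverse-weighting idea in the summary can be illustrated with a minimal sketch. It is not the paper's exact estimator: the Gaussian kernel, the split of the covariates into an always-observed part `x` and a sometimes-missing part `v`, the missing-at-random assumption that selection depends only on `(x, y)`, and all function names and bandwidths below are assumptions made for illustration. The selection probabilities are estimated by kernel regression (option (1) in the summary), and the complete cases are then reweighted by their inverse.

```python
import numpy as np

def gaussian_kernel(u):
    """Product Gaussian kernel evaluated on each row of u (shape (n, d))."""
    return np.exp(-0.5 * np.sum(u ** 2, axis=1))

def nw_estimate(query, points, values, h, weights=None):
    """Nadaraya-Watson estimate at one query point, with optional case weights."""
    k = gaussian_kernel((points - query) / h)
    if weights is not None:
        k = k * weights
    denom = k.sum()
    return float(k @ values / denom) if denom > 0 else 0.0

def ipw_kernel_regression(query, x, v, y, delta, h_reg, h_sel):
    """Inverse-probability-weighted kernel regression of y on (x, v).

    x, y  : always observed (1-d arrays)
    v     : covariate that may be missing (entries with delta == 0 are ignored)
    delta : 1 if v is observed, 0 otherwise
    Assumption: P(delta = 1 | x, v, y) depends on (x, y) only.
    """
    n = len(y)
    always = np.column_stack([x, y])
    # Step 1: kernel regression of delta on the always-observed variables
    # gives an estimate of the selection probability for every case.
    pi_hat = np.array([
        nw_estimate(always[i], always, delta.astype(float), h_sel)
        for i in range(n)
    ])
    # Step 2: keep complete cases and weight each by 1 / pi_hat
    # (Horvitz-Thompson-type weighting).
    obs = delta == 1
    z = np.column_stack([x[obs], v[obs]])
    w = 1.0 / pi_hat[obs]
    return nw_estimate(np.asarray(query, dtype=float), z, y[obs], h_reg, weights=w)
```

A call such as `ipw_kernel_regression([0.5, 0.5], x, v, y, delta, h_reg=0.2, h_sel=0.2)` returns the weighted estimate at the point `(0.5, 0.5)`. For a classification application with labels in {0, 1}, one plausible plug-in rule would be to predict class 1 when this estimate exceeds 1/2; that rule is likewise an assumption of this sketch.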

MSC:
62G08 Nonparametric regression and quantile regression
62H30 Classification and discrimination; cluster analysis (statistical aspects)
Software:
hgam; np
References:
[1] Bernstein S (1946) The theory of probabilities. Gostehizdat Publishing House, Moscow
[2] Bravo, F, Semiparametric estimation with missing covariates, J Multivar Anal, 139, 329-346, (2015) · Zbl 1328.62196
[3] Chen, HY, Nonparametric and semiparametric models for missing covariates in parametric regression, J Am Stat Assoc, 99, 1176-1189, (2004) · Zbl 1112.62324
[4] Cheng, PE; Chu, CK, Kernel estimation of distribution functions and quantiles with missing data, Stat Sin, 6, 63-78, (1996) · Zbl 0839.62038
[5] Devroye, L, On the almost everywhere convergence of nonparametric regression function estimates, Ann Stat, 9, 1310-1319, (1981) · Zbl 0477.62025
[6] Devroye L, Györfi L, Lugosi G (1985) Nonparametric density estimation: the \(L_1\) view. Wiley, New York · Zbl 0546.62015
[7] Devroye, L; Krzyżak, A, An equivalence theorem for \(L_1\) convergence of kernel regression estimate, J Stat Plan Inference, 23, 71-82, (1989) · Zbl 0686.62027
[8] Devroye, L; Wagner, T, On the \(L_1\) convergence of kernel estimators of regression functions with applications in discrimination, Z. Wahrsch. Verw. Gebiete, 51, 15-25, (1980) · Zbl 0396.62044
[9] Efromovich, S, Nonparametric regression with predictors missing at random, J Am Stat Assoc, 106, 306-319, (2011) · Zbl 1396.62078
[10] Faes, C; Ormerod, JT; Wand, MP, Variational Bayesian inference for parametric and nonparametric regression with missing data, J Am Stat Assoc, 106, 959-971, (2011) · Zbl 1229.62028
[11] Guo, X; Xu, W; Zhu, L, Multi-index regression models with missing covariates at random, J Multivar Anal, 123, 345-363, (2014) · Zbl 1278.62053
[12] Györfi L, Kohler M, Krzyżak A, Walk H (2002) A distribution-free theory of nonparametric regression. Springer, New York · Zbl 1021.62024
[13] Härdle, W; Marron, J, Optimal bandwidth selection in nonparametric regression function estimation, Ann Stat, 13, 1465-1481, (1985) · Zbl 0594.62043
[14] Hirano, K; Imbens, GW; Ridder, G, Efficient estimation of average treatment effects using the estimated propensity score, Econometrica, 71, 1161-1189, (2003) · Zbl 1152.62328
[15] Horvitz, DG; Thompson, DJ, A generalization of sampling without replacement from a finite universe, J Am Stat Assoc, 47, 663-685, (1952) · Zbl 0047.38301
[16] Hu, Y; Zhu, Q; Tian, M, An efficient technique of multiple imputation in nonparametric quantile regression, J Math Stat, 10, 30-44, (2014)
[17] Ibrahim, JG; Lipsitz, SR; Chen, MH, Missing covariates in generalized linear models when the missing data mechanism is non-ignorable, J R Stat Soc Ser B (Statistical Methodology), 61, 173-190, (1999) · Zbl 0917.62060
[18] Kohler, M; Krzyżak, A; Walk, H, Strong consistency of automatic kernel regression estimates, Ann. Inst. Stat. Math., 55, 287-308, (2003) · Zbl 1049.62042
[19] Liang, H; Wang, S; Robins, J; Carroll, R, Estimation in partially linear models with missing covariates, J Am Stat Assoc, 99, 357-367, (2004) · Zbl 1117.62385
[20] Lipsitz, SR; Ibrahim, JG, A conditional model for incomplete covariates in parametric regression models, Biometrika, 83, 916-922, (1996) · Zbl 0885.62026
[21] Little RJA, Rubin DB (2002) Statistical analysis with missing data. Wiley, New York
[22] Meier, L; van de Geer, S; Bühlmann, P, High-dimensional additive modeling, Ann Stat, 37, 3779-3821, (2009) · Zbl 1360.62186
[23] Mojirsheibani, M, Nonparametric curve estimation with missing data: a general empirical process approach, J Stat Plan Inference, 137, 2733-2758, (2007) · Zbl 1331.62221
[24] Mojirsheibani, M, Some results on classifier selection with missing covariates, Metrika, 75, 521-539, (2012) · Zbl 1300.62044
[25] Pollard D (1984) Convergence of stochastic processes. Springer, New York · Zbl 0544.60045
[26] Racine, J; Hayfield, T, Nonparametric econometrics: the np package, J Stat Softw, 27, 1-32, (2008)
[27] Racine, J; Li, Q, Nonparametric estimation of regression functions with both categorical and continuous data, J Econom, 119, 99-130, (2004) · Zbl 1337.62062
[28] Robins, JM; Rotnitzky, A; Zhao, LP, Estimation of regression coefficients when some regressors are not always observed, J Am Stat Assoc, 89, 846-866, (1994) · Zbl 0815.62043
[29] Sinha, S; Saha, KK; Wang, S, Semiparametric approach for non-monotone missing covariates in a parametric regression model, Biometrics, 70, 299-311, (2014) · Zbl 1419.62450
[30] Spiegelman, C; Sacks, J, Consistent window estimation in nonparametric regression, Ann Stat, 8, 240-246, (1980) · Zbl 0432.62066
[31] van der Vaart AW, Wellner JA (1996) Weak convergence and empirical processes with applications to statistics. Springer, New York · Zbl 0862.60002
[32] Walk, H, On cross-validation in kernel and partitioning regression estimation, Stat Probab Lett, 59, 113-123, (2002) · Zbl 1092.62530
[33] Walk H (2002b) Almost sure convergence properties of Nadaraya-Watson regression estimates. In: Modeling uncertainty. International Series in Operations Research & Management Science, vol 46. Kluwer Academic Publishers, Boston · Zbl 0594.62043
[34] Wang, L; Rotnitzky, A; Lin, X, Nonparametric regression with missing outcomes using weighted kernel estimating equations, J Am Stat Assoc, 105, 1135-1146, (2010) · Zbl 1390.62068
[35] Zhang, Z; Rockette, HE, On maximum likelihood estimation in parametric regression with missing covariates, J Stat Plan Inference, 134, 206-223, (2005) · Zbl 1066.62038
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.