Robust model averaging method based on LOF algorithm. (English) Zbl 07812199

Summary: Model averaging is an alternative to model selection that accounts for the uncertainty of the selection process and makes full use of the information in the candidate models. However, most existing model averaging criteria do not consider the influence of outliers on the estimation procedure. This paper develops a robust model averaging approach based on the local outlier factor (LOF) algorithm, which downweights outliers in the covariates. The asymptotic optimality of the proposed robust model averaging estimator is derived under some regularity conditions. The authors further prove that the LOF-based weight estimator is consistent for the theoretically optimal weight vector. Numerical studies, including Monte Carlo simulations and a real data example, illustrate the proposed methodology.
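
To make the idea of downweighting covariate outliers concrete, the following is a minimal sketch of the LOF score of Breunig et al. (2000) in plain Python. It is not the paper's procedure: the summary does not state how LOF scores are turned into weights, so the function `observation_weights` and its mapping min(1, 1/LOF) are hypothetical choices made only for illustration, as is the neighbourhood size `k`.

```python
import math


def lof_scores(points, k):
    """Local outlier factor of each point (brute-force O(n^2) sketch).

    Assumes no point has k or more exact duplicates, so that every
    local reachability density is finite.
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(points)")
    dist = [[math.dist(p, q) for q in points] for p in points]

    k_dist, neigh = [], []
    for i in range(n):
        others = sorted((dist[i][j], j) for j in range(n) if j != i)
        kd = others[k - 1][0]  # distance to the k-th nearest neighbour
        k_dist.append(kd)
        # The neighbourhood keeps ties at the k-distance.
        neigh.append([j for d, j in others if d <= kd])

    lrd = []  # local reachability density
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neigh[i]]
        mean_reach = sum(reach) / len(reach)
        if mean_reach == 0:
            raise ValueError("too many duplicate points for this k")
        lrd.append(1.0 / mean_reach)

    # LOF: average neighbour density relative to the point's own density.
    return [
        sum(lrd[j] for j in neigh[i]) / (len(neigh[i]) * lrd[i])
        for i in range(n)
    ]


def observation_weights(scores):
    """Hypothetical mapping (an assumption, not from the paper):
    weight 1 for inliers (LOF <= 1), shrinking as 1/LOF for outliers."""
    return [min(1.0, 1.0 / s) for s in scores]


# Five clustered covariate vectors and one point far from the cluster.
covariates = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
scores = lof_scores(covariates, k=3)
weights = observation_weights(scores)
```

Points inside the cluster receive LOF scores close to 1 and weights close to 1, while the isolated point receives a much larger score and a correspondingly small weight. Such weights could then enter a weighted fit of each candidate model, but that step depends on the criterion defined in the paper and is not sketched here.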

MSC:

62-XX Statistics
91-XX Game theory, economics, finance, and other social and behavioral sciences
