zbMATH — the first resource for mathematics

Cellwise robust M regression. (English) Zbl 07202079
Summary: The cellwise robust M regression (CRM) estimator is introduced as the first estimator of its kind that intrinsically yields both a map of cellwise outliers consistent with the linear model and a vector of regression coefficients that is robust against vertical outliers and leverage points. As a by-product, the method yields a weighted and imputed data set containing estimates of the values that the cellwise outliers would have needed to take in order to fit the model. The method is illustrated to be as robust as its casewise counterpart, MM regression. Because the cellwise regression method discards less information than any casewise robust estimator, its predictive power can be expected to be at least as good as that of casewise alternatives. These results are corroborated in a simulation study. Moreover, while the simulations show that predictive performance is at least on par with that of casewise methods, if not better, an application to a data set consisting of compositions of Swiss nutrients shows that, in individual cases, CRM can achieve a much higher predictive accuracy than MM regression.
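For orientation, the casewise idea that the summary contrasts with can be sketched as a plain M regression fitted by iteratively reweighted least squares. This is not the CRM algorithm and not MM regression: it is a minimal illustration, assuming Huber weights, a normalised MAD residual scale, and the conventional tuning constant 1.345. The function name `huber_irls` and the simulated data are hypothetical. With these weights the fit downweights whole rows with large residuals (vertical outliers) but offers no protection against leverage points, and it performs no cellwise detection or imputation.

```python
import numpy as np


def huber_irls(X, y, c=1.345, n_iter=50, tol=1e-8):
    """Casewise Huber M regression via iteratively reweighted least squares.

    Illustrative sketch only; returns (coefficients incl. intercept, case weights).
    """
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    X1 = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    beta = np.linalg.lstsq(X1, y, rcond=None)[0]    # ordinary least-squares start
    w = np.ones(len(y))
    for _ in range(n_iter):
        r = y - X1 @ beta
        # Robust residual scale: normalised median absolute deviation
        scale = 1.4826 * np.median(np.abs(r - np.median(r)))
        if scale == 0:
            break
        u = np.abs(r) / scale
        # Huber weights: 1 for |u| <= c, c/|u| beyond that
        w = c / np.maximum(u, c)
        sw = np.sqrt(w)
        beta_new = np.linalg.lstsq(X1 * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta, w


# Simulated data (assumption): y = 1 + 2x + noise, with the first 10% of
# responses shifted upwards to act as vertical outliers.
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=n)
y[:20] += 20.0

beta, w = huber_irls(x, y)
print("coefficients:", beta)
print("largest weight among contaminated rows:", w[:20].max())
```

Each contaminated row receives a single small weight, so all of its cells are discounted together. That is the loss of information the summary refers to when it says a cellwise method discards less than any casewise estimator.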
MSC:
62 Statistics