Detecting influential data points for the Hill estimator in Pareto-type distributions. (English) Zbl 1471.62093

Summary: Pareto-type distributions are extreme value distributions for which the extreme value index \(\gamma > 0\). Classical estimators of \(\gamma\), such as the Hill estimator, tend to overestimate this parameter in the presence of outliers. The empirical influence function plot, which displays the influence that each data point has on the Hill estimator, is introduced. To avoid a masking effect, the empirical influence function is based on a new robust GLM estimator for \(\gamma\). This robust GLM estimator is also used to estimate high quantiles of the data-generating distribution, so that data points exceeding such a quantile can be flagged as unusually large.
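The ideas in the summary can be illustrated with a minimal Python sketch. It is not the method of the paper: the robust GLM estimator is not reproduced here. Instead, a crude median-based estimate of \(\gamma\) (the hypothetical helper `median_gamma`) stands in for it, the influence of a point is approximated by a plain leave-one-out difference of the Hill estimator, and the high quantile is a Weissman-type extrapolation. All function names, the sample size, the choice \(k = 50\), the exceedance probability \(p = 0.001\) and the planted outlier are assumptions made for illustration only.

```python
import math
import random
import statistics


def hill_estimator(data, k):
    """Hill estimate of gamma from the k largest observations."""
    x = sorted(data, reverse=True)            # x[0] is the sample maximum
    if not 1 <= k < len(x):
        raise ValueError("need 1 <= k < len(data)")
    threshold = x[k]                          # (k+1)-th largest value
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return sum(math.log(v / threshold) for v in x[:k]) / k


def leave_one_out_influence(data, k):
    """Hill estimate on all data minus the estimate with point i removed."""
    data = list(data)
    full = hill_estimator(data, k)
    return [full - hill_estimator(data[:i] + data[i + 1:], k)
            for i in range(len(data))]


def median_gamma(data, k):
    """Crude robust stand-in (assumption, NOT the paper's GLM estimator).

    For a strict Pareto tail the log-excesses are exponential with mean
    gamma, so their median equals gamma * ln(2).
    """
    x = sorted(data, reverse=True)
    threshold = x[k]
    log_excesses = [math.log(v / threshold) for v in x[:k]]
    return statistics.median(log_excesses) / math.log(2)


def weissman_quantile(data, k, p, gamma):
    """Weissman-type estimate of the level exceeded with probability p."""
    x = sorted(data, reverse=True)
    return x[k] * (k / (len(x) * p)) ** gamma


# Illustrative data: strict Pareto sample with gamma = 0.5 plus one planted outlier.
rng = random.Random(0)
true_gamma = 0.5
clean = [(1.0 - rng.random()) ** (-true_gamma) for _ in range(200)]
sample = clean + [1000.0 * max(clean)]        # outlier is the last element
k = 50

influence = leave_one_out_influence(sample, k)
most_influential = max(range(len(sample)), key=influence.__getitem__)

cutoff = weissman_quantile(sample, k, p=0.001, gamma=median_gamma(sample, k))
flagged = [i for i, v in enumerate(sample) if v > cutoff]

print("Hill (clean):       ", hill_estimator(clean, k))
print("Hill (with outlier):", hill_estimator(sample, k))
print("most influential index:", most_influential)
print("flagged indices:", flagged)
```

Plotting `influence` against the observation index gives a rough analogue of an influence plot. Basing the cutoff on a robust estimate rather than on the Hill estimate itself reflects the masking concern raised in the summary: an inflated \(\gamma\) estimate pushes the quantile upward and can hide the very points that inflated it.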

MSC:

62-08 Computational methods for problems pertaining to statistics
62G32 Statistics of extreme values; tail inference
62G35 Nonparametric robustness

References:

[1] Beirlant, J.; Dierckx, G.; Goegebeur, Y.; Matthys, G., Tail index estimation and an exponential regression model, Extremes, 2, 177-200, (1999) · Zbl 0947.62034
[2] Beirlant, J.; Dierckx, G.; Guillou, A.; Stărică, C., On exponential representations of log-spacings of extreme order statistics, Extremes, 5, 2, 157-180, (2002) · Zbl 1036.62040
[3] Beirlant, J.; Goegebeur, Y.; Teugels, J. L.; Segers, J., Statistics of extremes: theory and applications, (2004), Wiley Chichester
[4] Brazauskas, V.; Serfling, R. J., Robust and efficient estimation of the tail index of a single-parameter Pareto distribution, North American Actuarial Journal, 4, 12-27, (2000) · Zbl 1083.62505
[5] Cai, J.-J.; Einmahl, J.; de Haan, L., Estimation of extreme risk regions under multivariate regular variation, The Annals of Statistics, 39, 1803-1826, (2011) · Zbl 1221.62075
[6] Cantoni, E.; Ronchetti, E., Robust inference for generalized linear models, Journal of the American Statistical Association, 96, 1022-1030, (2001) · Zbl 1072.62610
[7] Cantoni, E.; Ronchetti, E., A robust approach for skewed and heavy-tailed outcomes in the analysis of health care expenditure, Journal of Health Economics, 25, 198-213, (2006)
[8] Debruyne, M.; Hubert, M.; Van Horebeek, J., Detecting influential observations in kernel PCA, Computational Statistics and Data Analysis, 54, 3007-3019, (2010) · Zbl 1284.62046
[9] Dell’Aquila, R.; Embrechts, P., Extremes and robustness: a contradiction?, Financial Markets and Portfolio Management, 20, 103-118, (2006)
[10] Dierckx, G.; Beirlant, J.; De Waal, D.; Guillou, A., A new estimation method for Weibull-type tails based on the mean excess function, Journal of Statistical Planning and Inference, 139, 1905-1920, (2009) · Zbl 1161.62028
[11] Drees, H.; Kaufmann, E., Selecting the optimal sample fraction in univariate extreme value estimation, Stochastic Processes and Their Applications, 75, 2, 149-172, (1998) · Zbl 0926.62013
[12] DuMouchel, W. H., Estimating the stable index \(\alpha\) in order to measure tail thickness: a critique, The Annals of Statistics, 11, 4, 1019-1031, (1983) · Zbl 0547.62022
[13] Dupuis, D. J.; Field, C. A., Robust estimation of extremes, The Canadian Journal of Statistics, 26, 199-215, (1998) · Zbl 0915.62017
[14] Dupuis, D. J.; Morgenthaler, S., Robust weighted likelihood estimators with an application to bivariate extreme value problems, The Canadian Journal of Statistics, 30, 1, 17-36, (2002) · Zbl 1003.62016
[15] Dupuis, D. J.; Victoria-Feser, M.-P., A robust prediction error criterion for Pareto modelling of upper tails, The Canadian Journal of Statistics, 34, 639-658, (2006) · Zbl 1115.62056
[16] Ferreira, M.; Canto e Castro, L., Modeling rare events through a \(p\)RARMAX process, Journal of Statistical Planning and Inference, 140, 3552-3566, (2010) · Zbl 1372.62006
[17] Fisher, R.; Tippett, L., On the estimation of the frequency distributions of the largest or smallest members of a sample, Proceedings of the Cambridge Philosophical Society, 24, 180-190, (1928) · JFM 54.0560.05
[18] Fraga Alves, M. I.; Gomes, M. I.; de Haan, L., A new class of semi-parametric estimators of the second order parameter, Portugaliae Mathematica, 60, 2, 193-213, (2003) · Zbl 1042.62050
[19] Gnedenko, B., Sur la distribution limite du terme maximum d’une série aléatoire, Annals of Mathematics, 44, 423-453, (1943) · Zbl 0063.01643
[20] Gomes, M. I.; Martins, M. J.; Neves, M., Improving second order reduced bias extreme value index estimation, REVSTAT Statistical Journal, 5, 2, 177-207, (2007) · Zbl 1513.62089
[21] Gomes, M. I.; Oliveira, O., Maximum likelihood revisited under a semi-parametric context — estimation of the tail index, Journal of Statistical Computation and Simulation, 73, 4, 285-301, (2003) · Zbl 1046.62050
[22] Guillou, A.; Hall, P., A diagnostic for selecting the threshold in extreme value analysis, Journal of the Royal Statistical Society B., 63, 293-305, (2001) · Zbl 0979.62039
[23] Hampel, F.; Ronchetti, E.; Rousseeuw, P.; Stahel, W., Robust statistics: the approach based on influence functions, (1986), Wiley New York
[24] Heritier, S.; Cantoni, E.; Copt, S.; Victoria-Feser, M.-P., Robust methods in biostatistics, Wiley Series in Probability and Statistics, (2009), John Wiley & Sons Ltd Chichester · Zbl 1163.62085
[25] Hill, B., A simple general approach to inference about the tail of a distribution, The Annals of Statistics, 3, 1163-1174, (1975) · Zbl 0323.62033
[26] Hubert, M.; Vandervieren, E., An adjusted boxplot for skewed distributions, Computational Statistics and Data Analysis, 52, 12, 5186-5201, (2008) · Zbl 1452.62074
[27] Juárez, S. F.; Schucany, W. R., Robust and efficient estimation for the generalized Pareto distribution, Extremes, 7, 237-251, (2004) · Zbl 1091.62017
[28] Matthys, G.; Beirlant, J., Adaptive threshold selection in tail index estimation, (Embrechts, Paul, Extremes and Integrated Risk Management, (2000), UBS Warburg), 37-49
[29] McCullagh, P.; Nelder, J. A., Generalized linear models, (1983), Chapman and Hall London · Zbl 0588.62104
[30] McElroy, T.; Jach, A., Tail index estimation in the presence of long-memory dynamics, Computational Statistics and Data Analysis, 56, 266-282, (2012) · Zbl 1318.62276
[31] Perret-Gentil, C.; Victoria-Feser, M.-P., Robust mean-variance portfolio selection, FAME Research Paper Series rp140, (2005), International Center for Financial Asset Management and Engineering
[32] Pison, G.; Rousseeuw, P. J.; Filzmoser, P.; Croux, C., Robust factor analysis, Journal of Multivariate Analysis, 84, 145-172, (2003) · Zbl 1038.62055
[33] Pison, G.; Van Aelst, S., Diagnostic plots for robust multivariate methods, Journal of Computational and Graphical Statistics, 13, 310-329, (2004)
[34] Vandewalle, B.; Beirlant, J.; Christmann, A.; Hubert, M., A robust estimator for the tail index of Pareto-type distributions, Computational Statistics and Data Analysis, 51, 6252-6268, (2007) · Zbl 1445.62102
[35] Vandewalle, B.; Beirlant, J.; Hubert, M., A robust estimator of the tail index based on an exponential regression model, (Hubert, M.; Pison, G.; Struyf, A.; Van Aelst, S., Theory and Applications of Recent Robust Methods, (2004), Birkhäuser Basel), 367-376, Statistics for Industry and Technology · Zbl 1088.62064
[36] Victoria-Feser, M.-P.; Ronchetti, E., Robust methods for personal-income distribution models, The Canadian Journal of Statistics, 22, 247-258, (1994) · Zbl 0801.62099
[37] Weissman, I., Estimation of parameters and large quantiles based on the \(k\) largest observations, Journal of the American Statistical Association, 73, 812-815, (1978) · Zbl 0397.62034
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.