
Approximating the conditional density given large observed values via a multivariate extremes framework, with application to environmental data. (English) Zbl 1257.62118

Summary: Phenomena such as air pollution levels are of greatest interest when observations are large, but standard prediction methods are not specifically designed for large observations. We propose a method, rooted in extreme value theory, which approximates the conditional distribution of an unobserved component of a random vector given large observed values. Specifically, for \(\mathbf{Z}=(Z_{1},\dots,Z_{d})^{T}\) and \(\mathbf{Z}_{-d}=(Z_{1},\dots,Z_{d-1})^{T}\), the method approximates the conditional distribution of \([Z_{d}\mid\mathbf{Z}_{-d}=\mathbf{z}_{-d}]\) when \(\|\mathbf{z}_{-d}\|>r_{\ast}\). The approach is based on the assumption that \(\mathbf{Z}\) is a multivariate regularly varying random vector of dimension \(d\). The conditional distribution approximation relies on knowledge of the angular measure of \(\mathbf{Z}\), which provides explicit structure for dependence in the distribution’s tail.
As the method produces a predictive distribution rather than just a point predictor, one can answer any question posed about the quantity being predicted, and, in particular, one can assess how well the extreme behavior is represented. Using a fitted model for the angular measure, we apply our method to nitrogen dioxide measurements in metropolitan Washington DC. We obtain a predictive distribution for the air pollutant at a location given the air pollutant’s measurements at four nearby locations and given that the norm of the vector of the observed measurements is large.
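The idea in the summary can be illustrated with a minimal numerical sketch. It is not the authors' implementation: it assumes nonnegative components on a common marginal scale, the \(L_{1}\) norm, a tail index `alpha = 1`, and a purely hypothetical angular density (an unnormalized symmetric Dirichlet). Under these assumptions the limiting intensity at a point \(\mathbf{z}\) is proportional to \(\|\mathbf{z}\|^{-(\alpha+d)}h(\mathbf{z}/\|\mathbf{z}\|)\), and the conditional density of \(Z_{d}\) follows by normalizing over \(z_{d}\). The function names, the observed values, and the grid are all illustrative.

```python
import math


def angular_density(w, a=2.0):
    """Hypothetical angular density on the unit simplex (unnormalized
    symmetric Dirichlet). A fitted model would replace this function."""
    return math.prod(wi ** (a - 1.0) for wi in w)


def conditional_density(z_obs, grid, h=angular_density, alpha=1.0):
    """Approximate the density of Z_d given Z_{-d} = z_obs on `grid`.

    Assumes nonnegative components, the L1 norm, and tail index `alpha`.
    The approximation is only meaningful when the norm of z_obs is large.
    """
    d = len(z_obs) + 1
    unnorm = []
    for zd in grid:
        z = list(z_obs) + [zd]
        r = sum(z)                      # L1 norm of the full vector
        w = [zi / r for zi in z]        # angular component on the simplex
        unnorm.append(r ** (-(alpha + d)) * h(w))
    # Normalize with the trapezoid rule so the density integrates to one
    # over the (truncated) grid.
    c = sum(
        0.5 * (unnorm[i] + unnorm[i + 1]) * (grid[i + 1] - grid[i])
        for i in range(len(grid) - 1)
    )
    return [u / c for u in unnorm]


# Illustrative values only: four observed components and a grid for Z_d.
z_obs = [40.0, 55.0, 35.0, 50.0]
grid = [0.5 * k for k in range(1, 4001)]
dens = conditional_density(z_obs, grid)
mode = grid[max(range(len(grid)), key=dens.__getitem__)]
```

Because the output is a full density on the grid, quantiles, exceedance probabilities, or a point predictor can be read off from `dens`, which is the advantage of a predictive distribution noted in the summary.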

MSC:

62P12 Applications of statistics to environmental and related topics
62H10 Multivariate distribution of statistics
62G32 Statistics of extreme values; tail inference

Software:

evd; ismev

References:

[1] Ballani, F. and Schlather, M. (2011). A construction principle for multivariate extreme value distributions. Biometrika 98 633-645. · Zbl 1230.62073 · doi:10.1093/biomet/asr034
[2] Beirlant, J., Goegebeur, Y., Segers, J., Teugels, J., Waal, D. D. and Ferro, C. (2004). Statistics of Extremes: Theory and Applications. Wiley, New York. · Zbl 1070.62036 · doi:10.1002/0470012382
[3] Boldi, M. O. and Davison, A. C. (2007). A mixture model for multivariate extremes. J. R. Stat. Soc. Ser. B Stat. Methodol. 69 217-229. · Zbl 1120.62030 · doi:10.1111/j.1467-9868.2007.00585.x
[4] Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer, London. · Zbl 0980.62043
[5] Coles, S. G. and Tawn, J. A. (1991). Modelling extreme multivariate events. J. Roy. Statist. Soc. Ser. B 53 377-392. · Zbl 0800.60020
[6] Cooley, D., Davis, R. A. and Naveau, P. (2010). The pairwise beta distribution: A flexible parametric multivariate model for extremes. J. Multivariate Anal. 101 2103-2117. · Zbl 1203.62104 · doi:10.1016/j.jmva.2010.04.007
[7] Craigmile, P. F., Cressie, N., Santner, T. J. and Rao, Y. (2006). A loss function approach to identifying environmental exceedances. Extremes 8 143-159. · Zbl 1115.62117 · doi:10.1007/s10687-006-7964-y
[8] Cressie, N. A. C. (1993). Statistics for Spatial Data. Wiley, New York. · Zbl 0799.62002
[9] Davis, R. A. and Resnick, S. I. (1989). Basic properties and prediction of max-ARMA processes. Adv. in Appl. Probab. 21 781-803. · Zbl 0716.62098 · doi:10.2307/1427767
[10] Davis, R. A. and Resnick, S. I. (1993). Prediction of stationary max-stable processes. Ann. Appl. Probab. 3 497-525. · Zbl 0779.60048 · doi:10.1214/aoap/1177005435
[11] de Haan, L. and Ferreira, A. (2006). Extreme Value Theory: An Introduction. Springer, New York. · Zbl 1101.62002
[12] EPA. (2010). Fact sheet: Final revisions to the national ambient air quality standards for nitrogen dioxide. Available at .
[13] Fisher, R. A. and Tippett, L. H. C. (1928). Limiting forms of the frequency distribution of the largest or smallest members of a sample. Math. Proc. Cambridge Philos. Soc. 24 180-190. · JFM 54.0560.05
[14] Friederichs, P. and Hense, A. (2007). Statistical downscaling of extreme precipitation events using censored quantile regression. Monthly Weather Review 135 2365-2378.
[15] Gnedenko, B. (1943). Sur la distribution limite du terme maximum d’une série aléatoire. Ann. of Math. (2) 44 423-453. · Zbl 0063.01643 · doi:10.2307/1968974
[16] Gneiting, T., Balabdaoui, F. and Raftery, A. E. (2007). Probabilistic forecasts, calibration and sharpness. J. R. Stat. Soc. Ser. B Stat. Methodol. 69 243-268. · Zbl 1120.62074 · doi:10.1111/j.1467-9868.2007.00587.x
[17] Gneiting, T. and Raftery, A. E. (2007). Strictly proper scoring rules, prediction, and estimation. J. Amer. Statist. Assoc. 102 359-378. · Zbl 1284.62093 · doi:10.1198/016214506000001437
[18] Gneiting, T. and Ranjan, R. (2011). Comparing density forecasts using threshold- and quantile-weighted scoring rules. J. Bus. Econom. Statist. 29 411-422. · Zbl 1219.91108 · doi:10.1198/jbes.2010.08110
[19] Gumbel, É. J. (1960). Distributions des valeurs extrêmes en plusieurs dimensions. Publ. Inst. Statist. Univ. Paris 9 171-173. · Zbl 0093.15303
[20] Hogg, R., McKean, J. and Craig, A. (2005). Introduction to Mathematical Statistics, 6th ed. Prentice Hall, Upper Saddle River, NJ.
[21] Joe, H. (1990). Families of min-stable multivariate exponential and multivariate extreme value distributions. Statist. Probab. Lett. 9 75-81. · Zbl 0686.62035 · doi:10.1016/0167-7152(90)90098-R
[22] Meyer, M. C. (2008). Inference using shape-restricted regression splines. Ann. Appl. Stat. 2 1013-1033. · Zbl 1149.62033 · doi:10.1214/08-AOAS167
[23] Resnick, S. I. (1987). Extreme Values, Regular Variation, and Point Processes. Springer, New York. · Zbl 0633.60001
[24] Resnick, S. (2002). Hidden regular variation, second order regular variation and asymptotic independence. Extremes 5 303-336. · Zbl 1035.60053 · doi:10.1023/A:1025148622954
[25] Resnick, S. I. (2007). Heavy-Tail Phenomena: Probabilistic and Statistical Modeling. Springer, New York. · Zbl 1152.62029 · doi:10.1007/978-0-387-45024-7
[26] Rootzén, H. and Tajvidi, N. (2006). Multivariate generalized Pareto distributions. Bernoulli 12 917-930. · Zbl 1134.62028 · doi:10.3150/bj/1161614952
[27] Schabenberger, O. and Gotway, C. A. (2005). Statistical Methods for Spatial Data Analysis. Chapman & Hall/CRC, Boca Raton, FL. · Zbl 1068.62096
[28] Song, D. and Gupta, A. K. (1997). \(L_{p}\)-norm uniform distribution. Proc. Amer. Math. Soc. 125 595-601. · Zbl 0866.62026 · doi:10.1090/S0002-9939-97-03900-2
[29] Stephenson, A. G. (2002). evd: Extreme value distributions. R News 2 31-32.
[30] Tawn, J. (1990). Modeling multivariate extreme value distributions. Biometrika 77 245-253. · Zbl 0716.62051 · doi:10.1093/biomet/77.2.245
[31] Wang, Y. and Stoev, S. A. (2011). Conditional sampling for spectrally discrete max-stable random fields. Adv. in Appl. Probab. 43 461-483. · Zbl 1225.60085 · doi:10.1239/aap/1308662488
[32] Wilks, D. (2006). Statistical Methods in the Atmospheric Sciences: An Introduction, 2nd ed. Academic Press, San Diego.