zbMATH — the first resource for mathematics

Modeling the variability of rankings. (English) Zbl 1200.62149
Summary: For better or for worse, rankings of institutions, such as universities, schools and hospitals, play an important role today in conveying information about relative performance. They inform policy decisions and budgets, and are often reported in the media. While overall rankings can vary markedly over relatively short time periods, it is not unusual to find that the ranks of a small number of “highly performing” institutions remain fixed, even when the data on which the rankings are based are extensively revised, and even when a large number of new institutions are added to the competition. We endeavor to model this phenomenon. In particular, we interpret as a random variable the value of the attribute on which the ranking should ideally be based. More precisely, if $p$ items are to be ranked then the true, but unobserved, attributes are taken to be values of $p$ independent and identically distributed variates. However, each attribute value is observed only with noise, and via a sample of size roughly equal to $n$, say. These noisy approximations to the true attributes are the quantities that are actually ranked. We show that, if the distribution of the true attributes is light-tailed (e.g., normal or exponential) then the number of institutions whose ranking is correct, even after recalculation using new data and even after many new institutions are added, is essentially fixed. Formally, $p$ is taken to be of order $n^C$ for any fixed $C > 0$, and the number of institutions whose ranking is reliable depends very little on $p$. On the other hand, cases where the number of reliable rankings increases significantly when new institutions are added are those for which the distribution of the true attributes is relatively heavy-tailed, for example, with tails that decay like $x^{-\alpha}$ for some $\alpha > 0$. These properties and others are explored analytically, under general conditions. A numerical study links the results to outcomes for real-data problems.
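
The setup described in the summary can be illustrated with a small simulation. The following is a minimal sketch, not the authors' method. It assumes Gaussian observation noise with standard deviation $1/\sqrt{n}$ (as for a mean of $n$ unit-variance observations), and it counts "correct" rankings as the length of the leading run on which the noisy ordering agrees with the true ordering. The function names `count_correct_top_ranks` and `simulate` are hypothetical, and the paper's formal definitions may differ.

```python
import random


def count_correct_top_ranks(true_values, noisy_values):
    """Length of the leading run where the noisy ordering matches the true one.

    Items are ordered from largest to smallest value. This definition of
    "number of correct rankings" is an assumption made for illustration.
    """
    true_order = sorted(range(len(true_values)), key=lambda i: -true_values[i])
    noisy_order = sorted(range(len(noisy_values)), key=lambda i: -noisy_values[i])
    count = 0
    for t, o in zip(true_order, noisy_order):
        if t != o:
            break
        count += 1
    return count


def simulate(p, n, draw_attribute, rng):
    """Draw p i.i.d. true attributes, observe each with noise, count correct top ranks.

    The noise model (Gaussian, standard deviation 1/sqrt(n)) is an assumption.
    """
    true_values = [draw_attribute(rng) for _ in range(p)]
    noisy_values = [v + rng.gauss(0.0, 1.0) / n ** 0.5 for v in true_values]
    return count_correct_top_ranks(true_values, noisy_values)


if __name__ == "__main__":
    rng = random.Random(0)
    n, reps = 100, 200
    for p in (100, 1000, 10000):
        # Light-tailed attributes: standard normal.
        light = sum(simulate(p, n, lambda r: r.gauss(0.0, 1.0), rng)
                    for _ in range(reps)) / reps
        # Heavy-tailed attributes: Pareto with tail index alpha = 2.
        heavy = sum(simulate(p, n, lambda r: r.paretovariate(2.0), rng)
                    for _ in range(reps)) / reps
        print(f"p={p}: light-tailed avg={light:.2f}, heavy-tailed avg={heavy:.2f}")
```

Running the script prints the average number of correctly ranked leading items as $p$ grows, for one light-tailed and one heavy-tailed attribute distribution, so the two regimes described in the summary can be compared under these assumptions.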

MSC:
62P99 Applications of statistics
65C60 Computational problems in statistics
62G99 Nonparametric inference
62G32 Statistics of extreme values; tail inference
62E20 Asymptotic distribution theory in statistics
62P25 Applications of statistics to social sciences
Software:
corpor