Modeling heterogeneity in ranked responses by nonparametric maximum likelihood: How do Europeans get their scientific knowledge? (English) Zbl 1220.62158

Summary: This paper is motivated by a Eurobarometer survey on science knowledge. As part of the survey, respondents were asked to rank sources of science information in order of importance. The official statistical analysis of these data, however, did not use the complete ranking information. Instead, we propose a method that treats ranked data as a set of paired comparisons; this places the problem in the standard framework of generalized linear models and also allows respondent covariates to be incorporated.
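The idea of treating a ranking as a set of paired comparisons can be illustrated with a minimal sketch. It is not the authors' implementation (the record lists the R package prefmod for that); the function name and the item labels are hypothetical, and ties are assumed not to occur.

```python
from itertools import combinations


def ranking_to_paired_comparisons(ranking):
    """Expand one respondent's ranking into all pairwise comparisons.

    `ranking` maps each item to its rank (1 = most important).
    Returns a dict mapping each unordered pair (a, b), with a < b
    alphabetically, to +1 if a is ranked above b and -1 otherwise.
    Assumption: ranks are distinct (no ties).
    """
    ranks = list(ranking.values())
    if len(set(ranks)) != len(ranks):
        raise ValueError("tied ranks are not handled in this sketch")
    comparisons = {}
    for a, b in combinations(sorted(ranking), 2):
        comparisons[(a, b)] = 1 if ranking[a] < ranking[b] else -1
    return comparisons


# Hypothetical item labels, for illustration only.
example = {"television": 1, "press": 2, "radio": 3, "internet": 4}
pcs = ranking_to_paired_comparisons(example)
```

A ranking of J items yields J(J-1)/2 comparisons, so the four items above give six. These derived comparisons are mutually dependent (they must be transitive), which is why a model for the whole response pattern is needed rather than independent binary regressions.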
An extension is proposed to allow for heterogeneity in the ranked responses. The resulting model uses a nonparametric formulation of the random effects structure, fitted using the EM algorithm. Each mass point is multivalued, with a parameter for each item. The resultant model is equivalent to a covariate latent class model, where the latent class profiles are provided by the mass point components and the covariates act on the class profiles. This provides an alternative interpretation of the fitted model. The approach is also suitable for paired comparison data.
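The mass-point formulation can be written down in generic finite-mixture notation. The symbols below are assumptions chosen for illustration and are not taken from the paper: $y_i$ is the response pattern of respondent $i$, $x_i$ the covariates, $\lambda$ the fixed item and covariate parameters, $\delta_k$ the $k$-th multivalued mass point (one component per item), $q_k$ its mass, and $f$ the probability of the observed pattern.

```latex
% Nonparametric random-effects likelihood with K mass points
L(\lambda, \delta, q) = \prod_{i=1}^{n} \sum_{k=1}^{K} q_k \, f(y_i \mid x_i; \lambda, \delta_k),
\qquad \sum_{k=1}^{K} q_k = 1 .

% E-step: posterior probability that respondent i belongs to mass point k
w_{ik} = \frac{q_k \, f(y_i \mid x_i; \lambda, \delta_k)}
              {\sum_{l=1}^{K} q_l \, f(y_i \mid x_i; \lambda, \delta_l)} .

% M-step for the masses; \lambda and \delta come from a weighted model fit with weights w_{ik}
q_k^{\text{new}} = \frac{1}{n} \sum_{i=1}^{n} w_{ik} .
```

Read this way, the weights $w_{ik}$ are posterior latent class membership probabilities and each $\delta_k$ is a class profile, which is the latent class interpretation mentioned in the summary.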

MSC:

62P25 Applications of statistics to social sciences
62J15 Paired and multiple comparisons; multiple testing
62G05 Nonparametric estimation
65C60 Computational problems in statistics (MSC2010)
62F07 Statistical ranking and selection procedures

Software:

npmlreg; EMMIX; prefmod; R

References:

[1] Aitkin, M. (1994). An EM algorithm for overdispersion in generalised linear models. In Proceedings of the 9th International Workshop on Statistical Modelling (J. Hinde, ed.) 1-8. Exeter University.
[2] Aitkin, M. (1996). A general maximum likelihood analysis of overdispersion in generalized linear models. Statist. Comput. 6 251-262.
[3] Aitkin, M. and Aitkin, I. (1996). A hybrid EM/Gauss-Newton algorithm for maximum likelihood in mixture distributions. Statist. Comput. 6 127-130.
[4] Böckenholt, U. (2001a). Hierarchical modelling of paired comparison data. Psychological Methods 6 49-66.
[5] Böckenholt, U. (2001b). Mixed-effects analyses of rank ordered data. Psychometrika 66 45-62. · Zbl 1293.62236 · doi:10.1007/BF02295731
[6] Bradley, R. and Terry, M. (1952). Rank analysis of incomplete block designs. I. The method of paired comparisons. Biometrika 39 324-345. · Zbl 0047.12903
[7] Busse, L., Orbanz, P. and Buhmann, J. (2007). Cluster analysis of heterogeneous rank data. In Proceedings of the 24th International Conference on Machine Learning 113-120. ACM Press, New York.
[8] Chapman, R. and Staelin, R. (1982). Exploiting rank ordered choice set data within the stochastic utility model. J. Marketing Res. 19 288-301.
[9] Christensen, T. (2001). Eurobarometer 55.2: Europeans, science and technology. Technical report, European Opinion Research Group, Commission of the European Communities, Brussels.
[10] Coull, B. and Agresti, A. (2000). Random effects modeling of multiple binomial responses using the multivariate binomial logit-normal distribution. Biometrics 56 73-80. · Zbl 1060.62533 · doi:10.1111/j.0006-341X.2000.00073.x
[11] Critchlow, D. and Fligner, M. (1991). Paired comparison, triple comparison, and ranking experiments as generalized linear models, and their implementation in GLIM. Psychometrika 56 517-533. · Zbl 0850.62574 · doi:10.1007/BF02294488
[12] Critchlow, D. and Fligner, M. (1993). Ranking models with item variables. In Probability Models and Statistical Analyses for Ranking Data (M. Fligner and J. Verducci, eds.). Lecture Notes in Statistics 80 1-19. Springer, New York. · Zbl 0766.62011
[13] Croon, M. (1989). Latent class models for the analysis of rankings. In New Developments in Psychological Choice Modelling (G. De Soete and J. S. Klauer, eds.) 99-121. Elsevier, Amsterdam.
[14] Dabic, M. and Hatzinger, R. (2009). Zielgruppenadäquate Abläufe in Konfigurationssystemen - eine empirische Studie im Automobilmarkt: Das Paarvergleichs-Pattern-Modell für Partial Rankings. In Präferenzanalyse mit R (R. Hatzinger, R. Dittrich and T. Salzberger, eds.) 119-150. Facultas, Wien.
[15] D’Elia, A. and Piccolo, D. (2005). A mixture model for preferences data analysis. Comput. Statist. Data Anal. 49 917-934. · Zbl 1429.62077
[16] Dietz, E. and Böhning, D. (1995). Statistical inference based on a general model of unobserved heterogeneity. In Statistical Modelling: Proceedings of the 10th International Workshop (G. U. H. Seeber, B. J. Francis, R. Hatzinger and G. Steckel-Berger, eds.). Lecture Notes in Statistics 104 75-82. Springer, New York.
[17] Dittrich, R., Francis, B., Hatzinger, R. and Katzenbeisser, W. (2007). A paired comparison approach for the analysis of sets of Likert scale responses. Statist. Model. 7 3-28. · doi:10.1177/1471082X0600700102
[18] Dittrich, R., Francis, B., Hatzinger, R. and Katzenbeisser, W. (2010). Missing observations in paired comparison data. Under revision. · Zbl 1072.62617
[19] Dittrich, R., Hatzinger, R. and Katzenbeisser, W. (1998). Modelling the effect of subject-specific covariates in paired comparison studies with an application to university rankings. Appl. Statist. 47 511-525. · Zbl 0915.62063 · doi:10.1111/1467-9876.00125
[20] Dittrich, R., Hatzinger, R. and Katzenbeisser, W. (2004). A log-linear approach for modelling ordinal paired comparison data on motives to start a PhD programme. Statist. Model. 4 181-193. · Zbl 1117.62483 · doi:10.1191/1471082X04st072oa
[21] Dittrich, R., Katzenbeisser, W. and Reisinger, H. (2000). The analysis of rank ordered preference data based on Bradley-Terry type models. OR Spektrum 22 117-134. · Zbl 0962.91018 · doi:10.1007/s002910050008
[22] Einbeck, J., Darnell, R. and Hinde, J. (2007). npmlreg: Nonparametric maximum likelihood estimation for random effect models. R package version 0.43.
[23] Fligner, M. and Verducci, J. (1988). Multistage ranking models. J. Amer. Statist. Assoc. 83 892-901. · Zbl 0719.62036 · doi:10.2307/2289322
[24] Fligner, M. and Verducci, J. (1993). Probability Models and Statistical Analyses for Ranking Data. Springer Lecture Notes in Statistics 80. Springer, New York. · Zbl 0754.00011
[25] Formann, A. K. (1992). Linear logistic latent class analysis for polytomous data. J. Amer. Statist. Assoc. 87 476-486.
[26] Francis, B., Dittrich, R. and Hatzinger, R. (2010). Supplement to “Modeling heterogeneity in ranked responses by non-parametric maximum likelihood: How do Europeans get their scientific knowledge?” · Zbl 1220.62158 · doi:10.1214/10-AOAS366
[27] Francis, B., Dittrich, R., Hatzinger, R. and Penn, R. (2002). Analysing ranks using paired comparison methods: An investigation of value orientation in Europe. Appl. Statist. 51 319-336. · Zbl 1111.62383 · doi:10.1111/1467-9876.00271
[28] Gormley, I. and Murphy, T. (2008a). A mixture of experts model for rank data with applications in election studies. Ann. Appl. Statist. 2 1452-1477. · Zbl 1454.62498 · doi:10.1214/08-AOAS178
[29] Gormley, I. and Murphy, T. (2008b). Exploring voting blocs within the Irish electorate. J. Amer. Statist. Assoc. 103 1014-1027. · Zbl 1205.62198 · doi:10.1198/016214507000001049
[30] Hartigan, J. and Kleiner, B. (1984). A mosaic of television ratings. Amer. Statist. 38 32-35.
[31] Hartzel, J., Agresti, A. and Caffo, B. (2001). Multinomial logit random effects models. Statist. Model. 1 81-102. · Zbl 1022.62059 · doi:10.1191/147108201128104
[32] Hatzinger, R. (2009). prefmod: Utilities to fit paired comparison models for preferences. R package version 0.8-17.
[33] Hatzinger, R. and Francis, B. (2004). Fitting paired comparison models in R. Technical Report 3, Department of Statistics and Mathematics, Wirtschaftsuniversität Wien.
[34] Kamakura, W. and Mazzon, J. (1991). Value segmentation? A model for the measurement of values and value systems. J. Consumer Res. 18 208-218.
[35] Lancaster, J. F. and Quade, D. (1983). Random effects in paired-comparison experiments using the Bradley-Terry model. Biometrics 39 245-249. · doi:10.2307/2530824
[36] Mallet, A. (1986). A maximum likelihood estimation method for random coefficient regression models. Biometrika 73 645-656. · Zbl 0615.62083 · doi:10.1093/biomet/73.3.645
[37] Mallows, C. (1957). Non-null ranking models: I. Biometrika 44 114-130. · Zbl 0087.34001 · doi:10.1093/biomet/44.1-2.114
[38] Matthews, J. and Morris, K. (1995). An application of Bradley-Terry-type models to the measurement of pain. Appl. Statist. 44 243-255. · Zbl 0821.62081 · doi:10.2307/2986348
[39] McLachlan, G., Peel, D., Basford, K. and Adams, P. (1999). The EMMIX software for the fitting of mixtures of Normal and t-components. Technical report, Department of Mathematics, University of Queensland.
[40] R Development Core Team (2009). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0.
[41] Schwarz, G. (1978). Estimating the dimension of a model. Ann. Statist. 6 461-464. · Zbl 0379.62005 · doi:10.1214/aos/1176344136
[42] Sheskin, D. (2007). Handbook of Parametric and Nonparametric Statistical Procedures. Chapman and Hall, London. · Zbl 1118.62001
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.