A note on the linearly weighted kappa coefficient for ordinal scales. (English) Zbl 1220.62172

Summary: A frequent criticism of weighted kappa coefficients is that the weights are arbitrarily defined. We show that using linear weights for a \(K\)-category ordinal scale is equivalent to deriving a kappa coefficient from \(K-1\) embedded \(2\times 2\) tables.
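
The equivalence stated in the summary can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's own derivation: it assumes linear weights \(w_{ij} = 1 - |i-j|/(K-1)\) and uses the fact that \(|i-j|\) counts the cut points between adjacent categories at which ratings \(i\) and \(j\) fall on opposite sides. Under that assumption, the weighted observed and expected agreements are the averages of the corresponding quantities over the \(K-1\) dichotomized \(2\times 2\) tables, so the pooled ratio of sums should reproduce the linearly weighted kappa. The function names and the example counts are hypothetical.

```python
def linear_weighted_kappa(table):
    """Weighted kappa with linear weights w_ij = 1 - |i - j| / (K - 1)."""
    K = len(table)
    n = sum(sum(row) for row in table)
    p = [[c / n for c in row] for row in table]
    row_m = [sum(p[i]) for i in range(K)]                        # rater 1 marginals
    col_m = [sum(p[i][j] for i in range(K)) for j in range(K)]   # rater 2 marginals
    po = pe = 0.0
    for i in range(K):
        for j in range(K):
            w = 1 - abs(i - j) / (K - 1)
            po += w * p[i][j]               # weighted observed agreement
            pe += w * row_m[i] * col_m[j]   # weighted chance agreement
    return (po - pe) / (1 - pe)


def kappa_from_embedded_tables(table):
    """Pool the K - 1 embedded 2x2 tables (categories below k vs. k and above).

    Assumption of this sketch: the tables are pooled as a ratio of sums,
    sum_k (po_k - pe_k) / sum_k (1 - pe_k), which follows from the weight
    identity described in the lead-in.
    """
    K = len(table)
    n = sum(sum(row) for row in table)
    num = den = 0.0
    for k in range(1, K):
        both_low = sum(table[i][j] for i in range(k) for j in range(k)) / n
        both_high = sum(table[i][j] for i in range(k, K) for j in range(k, K)) / n
        r_low = sum(table[i][j] for i in range(k) for j in range(K)) / n
        c_low = sum(table[i][j] for i in range(K) for j in range(k)) / n
        po_k = both_low + both_high                          # 2x2 observed agreement
        pe_k = r_low * c_low + (1 - r_low) * (1 - c_low)     # 2x2 chance agreement
        num += po_k - pe_k
        den += 1 - pe_k
    return num / den


# Made-up 3-category cross-classification of two raters (illustration only).
example = [[20, 5, 1],
           [4, 15, 6],
           [2, 3, 10]]
print(linear_weighted_kappa(example), kappa_from_embedded_tables(example))
```

The two printed values should agree up to floating-point rounding. For \(K = 2\) there is a single embedded table and both functions reduce to Cohen's unweighted kappa.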


62P99 Applications of statistics
62H17 Contingency tables
62P15 Applications of statistics to psychology

