Title: How to Assess Reviewer Rankings? A Theoretical and an Applied Approach
Authors: Larissa Bartok, Matthias A. Burzler
Affiliation: Modul University Vienna, Austria; University Vienna, Austria;
University of Applied Sciences Wiener Neustadt, Austria
Abstract:
Although the evaluation of inter-rater agreement is often necessary in
psychometric procedures (e.g. standard setting), the commonly used measures
are not unproblematic. Cohen's kappa and kappa_n are known to penalize raters
in specific settings. Krippendorff's alpha, on the other hand, does not fit
every rating problem precisely. The talk discusses a new approach to
investigate the probability of consistencies or discrepancies in a setting
where n independent raters rank k items. Here, we provide a suggestion for
using a discrete theoretical probability distribution to evaluate the
probability of the empirically retrieved rating results. We compare the
pairwise absolute row differences of an empirically obtained n×k matrix with
the theoretically expected differences assuming raters to randomly rank items.
In the simplest scenario there are k! permutations of a single reviewer
ranking k items. If n independent reviewers rank k items, (k!)^n ranking
matrices are possible, and the probability of obtaining a matrix with zero
differences (all reviewers rank identically) can therefore be calculated as
k!/(k!)^n. We present both theoretical considerations about the resulting
discrete probability function and a practical computational implementation
using R. In this context we illuminate the instability of ranking systems
per se.
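To illustrate the stated probability, the following is a minimal sketch that
computes k!/(k!)^n exactly and cross-checks it by enumerating all (k!)^n
ranking matrices for small n and k. The abstract's implementation is in R;
this Python version and its function names are illustrative assumptions, not
the authors' code.

```python
import itertools
import math
from fractions import Fraction

def prob_all_equal(n, k):
    """Probability that n independent random rankings of k items coincide:
    k! favourable matrices out of (k!)^n equally likely ranking matrices."""
    return Fraction(math.factorial(k), math.factorial(k) ** n)

def prob_all_equal_bruteforce(n, k):
    """Exact check by enumerating every possible n×k ranking matrix
    (feasible only for small n and k)."""
    perms = list(itertools.permutations(range(k)))
    total = 0
    equal = 0
    for matrix in itertools.product(perms, repeat=n):
        total += 1
        # Zero pairwise row differences means all rows are identical.
        if all(row == matrix[0] for row in matrix):
            equal += 1
    return Fraction(equal, total)

print(prob_all_equal(3, 3))             # → 1/36  (6 / 6^3)
print(prob_all_equal_bruteforce(3, 3))  # → 1/36
```

The `Fraction` type keeps the result exact; for realistic n and k the
closed-form k!/(k!)^n is used directly, since enumeration grows as (k!)^n.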