Inter-Rater Reliability Using SAS
The Practical Guide for Nominal, Ordinal, and Interval Data
by Kilem L. Gwet, Ph.D.

The primary objective of this book is to show practitioners simple, step-by-step approaches for organizing rating data, creating SAS datasets, and using the appropriate SAS procedures or special SAS macro programs to compute various inter-rater reliability coefficients. The author starts each topic with a brief, non-mathematical description of the agreement coefficients involved, using simple numeric examples to show how they work, before showing how they are calculated with SAS. The author offers practical SAS solutions for 2 raters as well as for 3 or more raters.
The FREQ procedure of SAS offers the calculation of Cohen's Kappa as an option when the number of raters is limited to 2. The introduction of this feature is without doubt a very welcome addition to the system. But besides offering Kappa as the only agreement coefficient, the use of FREQ to compute Kappa is full of pitfalls that could easily lead a careless practitioner to wrong results. For example, if one rater does not use a category that the other rater has used, SAS does not compute any Kappa at all. This problem is referred to in chapter 1 as the unbalanced-table issue. Even more seriously, if both raters use the same number of categories but not the same categories, SAS will produce "very wrong" results, because the FREQ procedure will match the wrong categories to determine agreement. This issue is referred to in chapter 1 as the "Diagonal Issue." The author has also identified a few other potentially serious problems with weighted Kappa. They are all clearly documented in this book, and a plan for resolving each of them is proposed.
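
To make the two pitfalls concrete, here is a minimal Python sketch of unweighted Cohen's Kappa for 2 raters. It is not the book's SAS code or macros, and the function name cohen_kappa and the sample ratings are hypothetical. The sketch assumes that agreement should be judged by category label over the union of the categories used by either rater, which is one way to sidestep both the unbalanced-table issue and the "Diagonal Issue."

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's Kappa for two raters (hypothetical helper).

    Categories are matched by label over the union of both raters'
    categories, so a category used by only one rater gets a zero
    count for the other rater instead of being dropped or mismatched.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of subjects")

    n = len(ratings_a)
    categories = set(ratings_a) | set(ratings_b)
    count_a = Counter(ratings_a)  # missing labels count as 0
    count_b = Counter(ratings_b)

    # Observed agreement: share of subjects given the same label by both raters.
    p_obs = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: sum over categories of the product of marginal shares.
    p_exp = sum(count_a[c] * count_b[c] for c in categories) / (n * n)

    if p_exp == 1.0:
        return float("nan")  # Kappa is undefined when chance agreement is total
    return (p_obs - p_exp) / (1.0 - p_exp)


# Unbalanced table: rater B never uses category "c".
unbalanced = cohen_kappa(list("abcabc"), list("abbabb"))

# Same number of categories, but different ones: A uses {a, b}, B uses {b, c}.
# Matching by position would pair "a" with "b" and "b" with "c".
shifted = cohen_kappa(list("aabb"), list("bbcc"))

print(round(unbalanced, 4), round(shifted, 4))
```

In the first example the raters agree on 4 of 6 subjects and chance agreement is 1/3, so Kappa is about 0.5 even though one rater never used "c". In the second example the raters never assign the same label, so Kappa is negative (about -0.33), whereas pairing the categories by position would wrongly suggest perfect agreement.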

