An intraclass correlation coefficient (ICC) is used to determine if items or subjects can be rated reliably by different raters.
The value of an ICC can range from 0 to 1, with 0 indicating no reliability among raters and 1 indicating perfect reliability.
The easiest way to calculate ICC in R is to use the icc() function from the irr package, which uses the following syntax:
icc(ratings, model, type, unit)
where:
- ratings: A dataframe or matrix of ratings
- model: The type of model to use. Options include “oneway” or “twoway”
- type: The type of relationship to calculate between raters. Options include “consistency” or “agreement”
- unit: The unit of analysis. Options include “single” or “average”
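The irr package is not part of base R, so it needs to be installed before it can be loaded. A minimal sketch (install.packages() only needs to be run once per machine):

#install the irr package (only needed once)
install.packages("irr")

#load the package
library(irr)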
This tutorial provides an example of how to use this function in practice.
Step 1: Create the Data
Suppose four different judges were asked to rate the quality of 10 different college entrance exams. We can create the following dataframe to hold the ratings of the judges:
#create data
data <- data.frame(A=c(1, 1, 3, 6, 6, 7, 8, 9, 8, 7),
                   B=c(2, 3, 8, 4, 5, 5, 7, 9, 8, 8),
                   C=c(0, 4, 1, 5, 5, 6, 6, 9, 8, 8),
                   D=c(1, 2, 3, 3, 6, 4, 6, 8, 8, 9))
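As a quick sanity check, we can confirm that the dataframe has one row per exam and one column per judge. This is a minimal sketch using base R functions:

#view the first few rows of the ratings
head(data)

#confirm the dimensions: 10 exams (rows) rated by 4 judges (columns)
dim(data)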
Step 2: Calculate the Intraclass Correlation Coefficient
Suppose the four judges were randomly selected from a population of qualified entrance exam judges, that we'd like to measure the absolute agreement among judges, and that we're interested in using the ratings of a single rater as the basis for our measurement.
We can use the following code in R to fit a two-way model, using absolute agreement as the relationship among raters, and using single as our unit of interest:
#load the interrater reliability package
library(irr)

#define data
data <- data.frame(A=c(1, 1, 3, 6, 6, 7, 8, 9, 8, 7),
                   B=c(2, 3, 8, 4, 5, 5, 7, 9, 8, 8),
                   C=c(0, 4, 1, 5, 5, 6, 6, 9, 8, 8),
                   D=c(1, 2, 3, 3, 6, 4, 6, 8, 8, 9))

#calculate ICC
icc(data, model = "twoway", type = "agreement", unit = "single")

   Model: twoway
   Type : agreement

   Subjects = 10
     Raters = 4
   ICC(A,1) = 0.782

 F-Test, H0: r0 = 0 ; H1: r0 > 0
  F(9,30) = 15.3 , p = 5.93e-09

 95%-Confidence Interval for ICC Population Values:
  0.554 < ICC < 0.931
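If we want to work with the result programmatically rather than just reading the printed summary, we can store the returned object and pull out its components. This is a sketch that assumes the object returned by icc() exposes value, lbound, and ubound elements, as in current versions of the irr package:

#store the result instead of printing it
result <- icc(data, model = "twoway", type = "agreement", unit = "single")

#extract the ICC estimate
result$value

#extract the lower and upper bounds of the 95% confidence interval
result$lbound
result$ubound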
The intraclass correlation coefficient (ICC) turns out to be 0.782.
Here is how to interpret the value of an intraclass correlation coefficient, according to Koo & Li:
- Less than 0.50: Poor reliability
- Between 0.5 and 0.75: Moderate reliability
- Between 0.75 and 0.9: Good reliability
- Greater than 0.9: Excellent reliability
Thus, we would conclude that an ICC of 0.782 indicates that the exams can be rated with “good” reliability by different raters.
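To make this interpretation reproducible, we can translate the Koo & Li cutoffs into a small helper function. The function name below is just for illustration:

#classify an ICC value using the Koo & Li cutoffs
interpret_icc <- function(icc_value) {
  if (icc_value < 0.5) {
    "Poor reliability"
  } else if (icc_value < 0.75) {
    "Moderate reliability"
  } else if (icc_value < 0.9) {
    "Good reliability"
  } else {
    "Excellent reliability"
  }
}

interpret_icc(0.782)
#[1] "Good reliability"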
A Note on Calculating ICC
There are several different versions of an ICC that can be calculated, depending on the following three factors:
- Model: One-Way Random Effects, Two-Way Random Effects, or Two-Way Mixed Effects
- Type of Relationship: Consistency or Absolute Agreement
- Unit: Single rater or the mean of raters
In the previous example, the ICC that we calculated used the following assumptions:
- Model: Two-Way Random Effects
- Type of Relationship: Absolute Agreement
- Unit: Single rater
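To see how these choices affect the estimate, we can simply change the arguments passed to icc() and compare the results. For example, the following sketch computes a consistency ICC and an average-rater ICC for the same ratings (the resulting values will generally differ from the single-rater agreement ICC above):

#two-way model, consistency instead of absolute agreement, single rater
icc(data, model = "twoway", type = "consistency", unit = "single")

#two-way model, absolute agreement, average of the raters
icc(data, model = "twoway", type = "agreement", unit = "average")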
For a detailed explanation of these assumptions, please refer to this article.