
- Defined by Shrout and Fleiss (1979) and McGraw and Wong (1996).

- Calculate the degree of consistency among single measurements rated by a specific set of raters.


Koo and Li (2016) suggest the following guideline for interpreting ICC values:

- below 0.50: poor
- between 0.50 and 0.75: moderate
- between 0.75 and 0.90: good
- above 0.90: excellent
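
As a minimal sketch, the thresholds above can be turned into a small lookup in R. The function name interpret_icc is hypothetical, and the handling of values that fall exactly on a boundary (0.50, 0.75, 0.90) is an assumption, since the guideline does not say which side they belong to.

```r
# Hypothetical helper: map ICC estimates to the Koo and Li (2016) labels.
# Assumption: each interval includes its lower bound, [lower, upper).
interpret_icc <- function(icc) {
  cut(
    icc,
    breaks = c(-Inf, 0.50, 0.75, 0.90, Inf),
    labels = c("poor", "moderate", "good", "excellent"),
    right = FALSE
  )
}

# One value from each band
as.character(interpret_icc(c(0.216, 0.62, 0.80, 0.95)))
## [1] "poor"      "moderate"  "good"      "excellent"
```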

Underlying Model: Two-way mixed effects model

\(x_{ij} = \mu + r_i + c_j + rc_{ij} + e_{ij} \)


\(\mu\): The population mean of observations.

\(r_i\): The row effects (effects between each sample) are random, independent and normally distributed with mean 0 and variance \(\sigma_{r}^2\).

\(c_j\): The column effects (effects between raters). In a mixed effects model the raters are a specific, fixed set, so these effects are fixed rather than random and are constrained to sum to zero.

\(rc_{ij}\): Interaction effect, also random, independent and normally distributed with mean 0 and variance \(\sigma_{rc}^2\).

\(e_{ij}\): The residual errors are random, independent and normally distributed with mean 0 and variance \(\sigma_{e}^2\). All random effects are pairwise independent.


The ICC is estimated from the mean squares of the two-way ANOVA table:

\(ICC(3,1) = \frac{MS_R - MS_E}{MS_R + (k-1) MS_E}\)

\(MS_R\) = mean square for rows;

\(MS_C\) = mean square for columns;

\(MS_E\) = mean square error;

\(k\) = number of measurements per subject (number of columns, i.e. raters).

Example in R

library(irr)
## Loading required package: lpSolve
data("anxiety", package = "irr")
anxiety
##    rater1 rater2 rater3
## 1       3      3      2
## 2       3      6      1
## 3       3      4      4
## 4       4      6      4
## 5       5      2      3
## 6       5      4      2
## 7       2      2      1
## 8       3      4      6
## 9       5      3      1
## 10      2      3      1
## 11      2      2      1
## 12      6      3      2
## 13      1      3      3
## 14      5      3      3
## 15      2      2      1
## 16      2      2      1
## 17      1      1      3
## 18      2      3      3
## 19      4      3      2
## 20      3      4      2

Using the irr package, specify the model, type and unit:

icc(
    anxiety, model = "twoway", 
    type = "consistency", unit = "single"
)
##  Single Score Intraclass Correlation
##    Model: twoway 
##    Type : consistency 
##    Subjects = 20 
##      Raters = 3 
##    ICC(C,1) = 0.216
##  F-Test, H0: r0 = 0 ; H1: r0 > 0 
##    F(19,38) = 1.83 , p = 0.0562 
##  95%-Confidence Interval for ICC Population Values:
##   -0.046 < ICC < 0.522
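
As a cross-check, the reported ICC(C,1) and F statistic can be reproduced from the mean squares formula using base R only. This is a sketch rather than the irr implementation: the ratings are typed in from the anxiety table printed above so the block runs without any package, and the variable names (ms_r, ms_e, icc_c1, f_stat) are chosen here for illustration.

```r
# Ratings copied from the anxiety data printed above (20 subjects x 3 raters).
anxiety <- data.frame(
  rater1 = c(3, 3, 3, 4, 5, 5, 2, 3, 5, 2, 2, 6, 1, 5, 2, 2, 1, 2, 4, 3),
  rater2 = c(3, 6, 4, 6, 2, 4, 2, 4, 3, 3, 2, 3, 3, 3, 2, 2, 1, 3, 3, 4),
  rater3 = c(2, 1, 4, 4, 3, 2, 1, 6, 1, 1, 1, 2, 3, 3, 1, 1, 3, 3, 2, 2)
)

x <- as.matrix(anxiety)
n <- nrow(x)  # number of subjects (rows)
k <- ncol(x)  # number of raters (columns)
grand <- mean(x)

# Two-way ANOVA sums of squares with one observation per cell
ss_total <- sum((x - grand)^2)
ss_rows  <- k * sum((rowMeans(x) - grand)^2)
ss_cols  <- n * sum((colMeans(x) - grand)^2)
ss_err   <- ss_total - ss_rows - ss_cols

# Mean squares
ms_r <- ss_rows / (n - 1)
ms_e <- ss_err / ((n - 1) * (k - 1))

# Single-measurement consistency ICC and its F statistic
icc_c1 <- (ms_r - ms_e) / (ms_r + (k - 1) * ms_e)
f_stat <- ms_r / ms_e

round(icc_c1, 3)  # 0.216, matching ICC(C,1) above
round(f_stat, 2)  # 1.83, matching F(19,38) above
```

Note that the column mean square never enters the formula: under consistency, systematic differences between raters are removed from the error term and do not lower the ICC.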

Using the psych package, ICC() computes all types at once. For ICC(3,1) with consistency, read the “Single_fixed_raters” row:

library(psych)
ICC(anxiety)

## Call: ICC(x = anxiety)
## Intraclass correlation coefficients 
##                          type  ICC   F df1 df2     p lower bound upper bound
## Single_raters_absolute   ICC1 0.18 1.6  19  40 0.094     -0.0405        0.44
## Single_random_raters     ICC2 0.20 1.8  19  38 0.056     -0.0045        0.45
## Single_fixed_raters      ICC3 0.22 1.8  19  38 0.056     -0.0073        0.48
## Average_raters_absolute ICC1k 0.39 1.6  19  40 0.094     -0.1323        0.70
## Average_random_raters   ICC2k 0.43 1.8  19  38 0.056     -0.0136        0.71
## Average_fixed_raters    ICC3k 0.45 1.8  19  38 0.056     -0.0222        0.73
##  Number of subjects = 20     Number of Judges =  3
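
To help read the table, the “Single_fixed_raters” and “Average_fixed_raters” rows are linked by the Spearman-Brown relation, since \((MS_R - MS_E)/MS_R = k \cdot ICC / (1 + (k-1) \cdot ICC)\). The short sketch below checks this with the rounded single-rater value from the output; the variable names are illustrative only.

```r
icc_single <- 0.216  # ICC(C,1) as reported in the irr output above
k <- 3               # number of raters

# Spearman-Brown step-up from a single rating to the mean of k ratings
icc_average <- k * icc_single / (1 + (k - 1) * icc_single)

round(icc_average, 2)  # 0.45, matching the ICC3k row
```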



Koo, Terry, and Mae Li. 2016. “A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research.” Journal of Chiropractic Medicine 15 (March). doi:10.1016/j.jcm.2016.02.012.
McGraw, K.O., and S.P. Wong. 1996. “Forming Inferences About Some Intraclass Correlation Coefficients.” Psychological Methods 1: 30–46.
Shrout, P.E., and J.L. Fleiss. 1979. “Intraclass Correlation: Uses in Assessing Rater Reliability.” Psychological Bulletin 86: 420–28.