How to Calculate Pooled Variance in R


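In statistics, pooled variance refers to a weighted average of two or more group variances, in which each group's variance is weighted by its degrees of freedom (its sample size minus one).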

We use the word “pooled” to indicate that we’re “pooling” two or more group variances to come up with a single number for the common variance between the groups.

In practice, pooled variance is used most often in a two-sample t-test that assumes the two populations have equal variances. That test is used to determine whether or not two population means are equal.

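The pooled variance between two samples is typically denoted as $s_p^2$ and is calculated as:

$$s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}$$

where $n_1$ and $n_2$ are the sample sizes and $s_1^2$ and $s_2^2$ are the sample variances of the two groups.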
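The definition at the top mentions two or more groups, while the formula above covers only two. A common generalization to k groups, written in the same notation (the index i and the group count k are symbols introduced here for illustration), weights each sample variance by its degrees of freedom:

```latex
s_p^2 = \frac{\sum_{i=1}^{k} (n_i - 1)\, s_i^2}{\sum_{i=1}^{k} (n_i - 1)}
```

With k = 2, this reduces to the two-group formula.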

Unfortunately, base R has no dedicated built-in function for calculating the pooled variance between two groups, but we can calculate it fairly easily.
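One way to make the calculation reusable is to wrap it in a small helper. The sketch below is only an illustration: the function name pooled_variance is hypothetical (it is not part of base R), and the vectors a and b are made-up values chosen to keep the arithmetic easy to follow.

```r
# Hypothetical helper (not part of base R): pooled variance of two or more
# numeric vectors, each weighted by its degrees of freedom (n - 1).
pooled_variance <- function(...) {
  groups <- list(...)

  # need at least two groups, each with at least two observations
  stopifnot(length(groups) >= 2, all(lengths(groups) >= 2))

  df   <- lengths(groups) - 1     # degrees of freedom per group
  vars <- sapply(groups, var)     # sample variance per group

  sum(df * vars) / sum(df)
}

# made-up values, for illustration only
a <- c(1, 2, 3)
b <- c(2, 4, 6, 8)

pooled_variance(a, b)
```

Because the helper accepts any number of vectors, the same call works for three or more groups, for example pooled_variance(a, b, c(5, 9, 13)).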

For example, suppose we want to calculate the pooled variance between two groups of 15 observations each.

The following code defines the two groups and shows how to calculate the pooled variance between them in R:

#define groups of data
x1 <- c(6, 7, 7, 8, 10, 11, 13, 14, 14, 16, 18, 19, 19, 19, 20)
x2 <- c(5, 7, 7, 8, 10, 13, 14, 15, 19, 20, 20, 23, 25, 28, 32)

#calculate sample size of each group
n1 <- length(x1)
n2 <- length(x2)

#calculate sample variance of each group
var1 <- var(x1)
var2 <- var(x2)

#calculate pooled variance between the two groups
pooled <- ((n1-1)*var1 + (n2-1)*var2) / (n1+n2-2)

#display pooled variance
pooled

[1] 46.97143

The pooled variance between these two groups turns out to be 46.97143.
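As a sanity check, the pooled variance can be tied back to the two-sample t-test mentioned earlier. The sketch below recomputes the pooled variance for the same data, builds the t statistic by hand, and compares it with the statistic returned by R's built-in t.test() function when var.equal = TRUE. The variable names t_manual and t_builtin are introduced here for illustration.

```r
# same data as in the example above
x1 <- c(6, 7, 7, 8, 10, 11, 13, 14, 14, 16, 18, 19, 19, 19, 20)
x2 <- c(5, 7, 7, 8, 10, 13, 14, 15, 19, 20, 20, 23, 25, 28, 32)

n1 <- length(x1)
n2 <- length(x2)

# pooled variance, as calculated above
pooled <- ((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2)

# t statistic computed by hand from the pooled variance
t_manual <- (mean(x1) - mean(x2)) / sqrt(pooled * (1 / n1 + 1 / n2))

# t statistic from the built-in test, with equal variances assumed
t_builtin <- unname(t.test(x1, x2, var.equal = TRUE)$statistic)

# check that the two statistics agree up to floating-point tolerance
all.equal(t_manual, t_builtin)
```

Note that t.test() defaults to var.equal = FALSE, which runs Welch's t-test and does not pool the variances, so the argument has to be set explicitly for this comparison.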

