An F-Test is used to test if the variances of two populations are equal.
The two-tailed version tests whether the variances differ in either direction, while the one-tailed version tests in one direction only: that the variance of the first population is either greater than or less than (but not both) the variance of the second population.
An F-Test is commonly used to answer the following questions:
1. Do two samples come from populations with equal variances?
2. Does a new process/treatment have lower variation than the current process/treatment?
Defining the F-Test
The two hypotheses for the F-Test are as follows:
H0 (null hypothesis):
σ₁² = σ₂² (the two population variances are equal)
HA (alternative hypothesis):
σ₁² < σ₂² (lower one-tailed test)
σ₁² > σ₂² (upper one-tailed test)
σ₁² ≠ σ₂² (two-tailed test)
The test statistic for the F-Test is defined as follows:
F-statistic = s₁² / s₂²
where s₁² and s₂² are the sample variances. The further this ratio is from one, the stronger the evidence for unequal population variances.
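As a minimal sketch of the definition above, the statistic can be computed by hand with var() and compared with the value that var.test() reports. The data vectors and variable names here are made up for illustration and do not come from the article.

```r
# Illustrative data (hypothetical values)
x <- c(12.1, 9.8, 11.4, 10.6, 13.0, 8.9, 10.2, 11.7)
y <- c(10.4, 10.9, 9.7, 10.1, 10.6, 9.9, 10.3, 10.8)

# Sample variances (var() uses the n - 1 denominator)
s1_sq <- var(x)
s2_sq <- var(y)

# The F-statistic is the ratio of the two sample variances
f_stat <- s1_sq / s2_sq
f_stat

# var.test() reports the same ratio as its test statistic
unname(var.test(x, y)$statistic)
```

Putting the sample with the larger variance in the numerator is not required, but it keeps the ratio above one, which can make the result easier to read.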
The critical value for the F-Test is defined as follows:
F Critical Value = Fα, N₁-1, N₂-1 from the F-distribution table with N₁-1 and N₂-1 degrees of freedom and a significance level of α. For a two-tailed test, α/2 is used in each tail.
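Instead of reading an F-distribution table, the critical values can be obtained with R's quantile function qf(). The sketch below assumes the common convention that the critical value leaves an area of α in the upper tail; the sample sizes and variable names are chosen only for illustration.

```r
alpha <- 0.05
n1 <- 100  # size of sample 1 (assumed for illustration)
n2 <- 100  # size of sample 2 (assumed for illustration)

# Upper one-tailed test: reject H0 if F exceeds this value
f_crit_upper <- qf(1 - alpha, df1 = n1 - 1, df2 = n2 - 1)

# Lower one-tailed test: reject H0 if F falls below this value
f_crit_lower <- qf(alpha, df1 = n1 - 1, df2 = n2 - 1)

# Two-tailed test: alpha is split evenly between the two tails,
# and H0 is rejected if F lies outside this pair of values
f_crit_two <- qf(c(alpha / 2, 1 - alpha / 2), df1 = n1 - 1, df2 = n2 - 1)

f_crit_upper
f_crit_lower
f_crit_two
```

In practice var.test() reports a p-value directly, so the critical values are mainly useful for checking a result by hand.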
Conducting an F-Test in R
The built-in R function var.test() can be used to compare two variances using the following syntax:
var.test(x, y, ratio = 1, alternative = c("two.sided", "less", "greater"), conf.level = 0.95, ...)
- x, y – numeric vectors
- ratio – hypothesized ratio of the population variances of x and y (default is 1)
- alternative – the alternative hypothesis of the test (default is "two.sided")
- conf.level – confidence level of the confidence interval returned for the ratio of variances (default is 0.95)
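To show how the alternative argument selects a one-tailed test, here is a small sketch for the question of whether a new process has lower variation than the current one. The data are simulated, and the seed, sample sizes, standard deviations, and variable names are all assumptions made for illustration.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Hypothetical measurements: the new process is simulated with a smaller sd
new_process     <- rnorm(n = 50, mean = 10, sd = 1.0)
current_process <- rnorm(n = 50, mean = 10, sd = 1.5)

# HA: the variance of new_process is less than that of current_process
result <- var.test(new_process, current_process, alternative = "less")

result$statistic  # ratio of the two sample variances
result$p.value    # one-tailed p-value
result$conf.int   # one-sided interval, so the lower bound is 0
```

The order of the first two arguments matters for a one-tailed test, because "less" and "greater" refer to the variance of x relative to the variance of y.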
The following code illustrates how to conduct an F-Test for two samples x and y:
#create two vectors x and y
x <- rnorm(n = 100, mean = 1, sd = 2.7)
y <- rnorm(n = 100, mean = 1, sd = 2)

#conduct two.sided F-test to test for equality of variances
var.test(x, y)

#	F test to compare two variances
#
#data:  x and y
#F = 1.6294, num df = 99, denom df = 99, p-value = 0.01592
#alternative hypothesis: true ratio of variances is not equal to 1
#95 percent confidence interval:
# 1.096316 2.421641
#sample estimates:
#ratio of variances
#          1.629381
We can see that the F-statistic for the test is 1.6294 and the corresponding p-value is 0.01592.
Since the p-value is less than our significance level of 0.05, we have sufficient evidence to reject the null hypothesis and conclude that the two population variances are not equal.
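As a rough check on the reported numbers, the two-tailed p-value and the confidence interval can be recomputed from the F-statistic and the degrees of freedom using pf() and qf(). This sketch takes the ratio 1.629381 and the 99 degrees of freedom from the output above; the variable names are illustrative.

```r
f_stat <- 1.629381  # ratio of variances from the output above
df1 <- 99           # numerator degrees of freedom
df2 <- 99           # denominator degrees of freedom

# Two-tailed p-value: twice the smaller of the two tail areas
lower_tail <- pf(f_stat, df1, df2)
p_value <- 2 * min(lower_tail, 1 - lower_tail)
p_value  # should be close to the 0.01592 reported above

# 95 percent confidence interval for the ratio of population variances
conf_level <- 0.95
beta <- (1 - conf_level) / 2
ci <- c(f_stat / qf(1 - beta, df1, df2),
        f_stat / qf(beta, df1, df2))
ci  # should be close to the interval reported above
```

The interval excludes 1, which agrees with the decision to reject the null hypothesis at the 0.05 level.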