ANCOVA stands for “analysis of covariance.” To understand how an ANCOVA works, it helps to first understand the ANOVA.
An ANOVA (analysis of variance) is used to determine whether or not there is a statistically significant difference between the means of three or more independent groups.
For example, suppose we want to know whether or not studying technique has an impact on exam scores for a class of students. We randomly split the class into three groups. Each group uses a different studying technique for one month to prepare for an exam. At the end of the month, all of the students take the same exam.
To find out if studying technique impacts exam scores, we can conduct a one-way ANOVA, which will tell us if there is a statistically significant difference between the mean scores of the three groups.
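As a rough illustration of what a one-way ANOVA computes, the sketch below calculates the F-statistic by hand for three groups. The exam scores are invented for illustration, and the names `one_way_anova` and `f_pvalue_three_groups` are hypothetical helpers, not part of any library. The p-value formula used here is a closed form that holds only when the numerator degrees of freedom equal 2 (i.e. exactly three groups); for other designs a statistics package would normally supply the p-value.

```python
from statistics import mean

# Hypothetical exam scores for three studying techniques (invented data).
scores = {
    "A": [78, 82, 85, 80, 75],
    "B": [88, 90, 84, 91, 87],
    "C": [72, 70, 78, 74, 76],
}


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups.values() for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Variation of individual scores around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def f_pvalue_three_groups(f_stat, df_within):
    """Upper-tail p-value of F(2, df_within); valid only for numerator df = 2."""
    return (1 + 2 * f_stat / df_within) ** (-df_within / 2)


f_stat, df_b, df_w = one_way_anova(scores)
p_value = f_pvalue_three_groups(f_stat, df_w)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}, p = {p_value:.5f}")
```

A small p-value here would suggest that at least one group mean differs from the others; it does not say which one.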
An ANCOVA is an extension of an ANOVA in which we’d like to determine if there is a statistically significant difference between the means of three or more independent groups after accounting for one or more covariates.
A covariate is a continuous variable that co-varies with the response variable.
For example, suppose we want to know whether or not studying technique has an impact on exam scores, but we want to account for the grade that the student already has in the class. We can use their current grade as a covariate and conduct an ANCOVA to determine if there is a statistically significant difference between the mean exam scores of the three groups.
This allows us to test whether or not studying technique has an impact on exam scores after the influence of the covariate has been removed.
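In model terms, a one-way ANCOVA with a single covariate is commonly written as shown below. The symbols are notation introduced here for illustration and do not come from the text above: y_ij is the exam score of student j in group i, x_ij is that student's current grade, tau_i is the effect of studying technique i, and beta is a slope assumed to be common to all groups.

```latex
y_{ij} = \mu + \tau_i + \beta\,(x_{ij} - \bar{x}) + \varepsilon_{ij},
\qquad \varepsilon_{ij} \sim N(0, \sigma^2)

H_0 : \tau_1 = \tau_2 = \tau_3 = 0
```

The test of H_0 asks whether the group effects are all zero once the linear contribution of the covariate has been accounted for.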
Thus, if we find a statistically significant difference in exam scores between the three studying techniques, we can conclude that this difference remains even after accounting for the student’s current grade in the class (i.e. whether or not they’re already doing well in the class).
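One way to see what "accounting for the covariate" means is to compare two regression models: a full model with a separate intercept per group and a common slope for the covariate, and a reduced model with a single regression line and no group effect. The sketch below does this by hand. The data are invented, the helper names `sums_about_means` and `ancova_group_test` are hypothetical, and the p-value uses a closed form that is valid only for three groups (numerator degrees of freedom of 2).

```python
from statistics import mean

# Hypothetical (current grade, exam score) pairs, 5 students per technique.
# These values are invented for illustration only.
data = {
    "A": [(70, 75), (80, 84), (75, 78), (85, 90), (90, 93)],
    "B": [(72, 85), (78, 88), (88, 97), (82, 94), (80, 91)],
    "C": [(68, 70), (76, 75), (84, 85), (79, 80), (88, 90)],
}


def sums_about_means(pairs):
    """Return (Sxx, Sxy, Syy): sums of squares and cross-products about the means."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxx, sxy, syy


def ancova_group_test(groups):
    """F-test for the group effect after adjusting for one covariate."""
    pooled = [pair for g in groups.values() for pair in g]
    k, n = len(groups), len(pooled)

    # Full model: one intercept per group plus a common covariate slope.
    within = [sums_about_means(g) for g in groups.values()]
    sxx_w, sxy_w, syy_w = (sum(col) for col in zip(*within))
    sse_full = syy_w - sxy_w ** 2 / sxx_w

    # Reduced model: a single regression line, i.e. no group effect.
    sxx_t, sxy_t, syy_t = sums_about_means(pooled)
    sse_reduced = syy_t - sxy_t ** 2 / sxx_t

    df_group, df_error = k - 1, n - k - 1
    f_stat = ((sse_reduced - sse_full) / df_group) / (sse_full / df_error)
    return f_stat, df_group, df_error


f_stat, df_group, df_error = ancova_group_test(data)
# Closed-form upper-tail p-value of F(2, df_error); valid only when df_group == 2.
p_value = (1 + 2 * f_stat / df_error) ** (-df_error / 2)
print(f"F({df_group}, {df_error}) = {f_stat:.2f}, p = {p_value:.6f}")
```

The error degrees of freedom are one fewer than in the plain ANOVA because the covariate slope is also estimated from the data.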
Assumptions of ANCOVA
Before performing an ANCOVA, it’s important to make sure the following assumptions are met:
- The covariate(s) and the factor variable(s) are independent – The covariate and the factor variable should be independent of each other, since adding a covariate term into the model only makes sense if the covariate and the factor variable act independently on the response variable.
- The covariate(s) are continuous – The covariates should be continuous (i.e. either interval or ratio data).
- Homogeneity of variances – The variances among the groups should be roughly equal.
- Independence – The observations in each group should be independent.
- Normality – The response variable should be roughly normally distributed in each group.
- No extreme outliers – There should be no extreme outliers in any of the groups that could significantly affect the results of the ANCOVA.
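Some of these assumptions can be screened with simple descriptive checks. The sketch below is one possible approach, using invented data and hypothetical helper names. It compares group variances with a ratio (a common rule of thumb treats a ratio under about 4 as acceptable, which is an assumption here rather than a formal test), flags outliers with the 1.5 × IQR rule, and runs a one-way ANOVA of the covariate on the factor to see whether the two look independent. Normality is not checked here, since that usually calls for a dedicated test or a Q-Q plot.

```python
from statistics import mean, quantiles, variance

# Hypothetical data, 5 students per technique (invented for illustration).
grades = {"A": [70, 80, 75, 85, 90], "B": [72, 78, 88, 82, 80], "C": [68, 76, 84, 79, 88]}
scores = {"A": [75, 84, 78, 90, 93], "B": [85, 88, 97, 94, 91], "C": [70, 75, 85, 80, 90]}


def variance_ratio(groups):
    """Largest group variance divided by the smallest."""
    variances = [variance(v) for v in groups.values()]
    return max(variances) / min(variances)


def iqr_outliers(values):
    """Values lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    fence = 1.5 * (q3 - q1)
    return [v for v in values if v < q1 - fence or v > q3 + fence]


def one_way_f(groups):
    """F-statistic and within-group df for a one-way ANOVA."""
    all_values = [v for g in groups.values() for v in g]
    grand = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k)), n - k


# Homogeneity of variances in the response.
ratio = variance_ratio(scores)
print(f"variance ratio: {ratio:.2f}")

# Extreme outliers within each group.
for name, values in scores.items():
    print(name, "outliers:", iqr_outliers(values))

# Covariate vs. factor: do mean grades differ across techniques?
f_cov, df_within = one_way_f(grades)
# Closed-form p-value for F(2, df_within); valid only for three groups.
p_cov = (1 + 2 * f_cov / df_within) ** (-df_within / 2)
print(f"covariate by group: F = {f_cov:.3f}, p = {p_cov:.3f}")
```

A large p-value in the last check is consistent with the covariate and the factor being independent, which is what random assignment to groups is meant to achieve.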
Example of ANCOVA
A teacher wants to know if three different studying techniques have an impact on exam scores, but she wants to account for the current grade that each student already has in the class.
She will perform an ANCOVA using the following variables:
- Factor variable: studying technique
- Covariate: current grade
- Response variable: exam score
The following table shows the dataset for the 15 students who were recruited to participate in the study:
After performing an ANCOVA on the dataset, the teacher ends up with the following results:
The p-value for studying technique is 0.03155. Since this value is less than 0.05, we reject the null hypothesis that each studying technique leads to the same average exam score after accounting for the student’s current grade in the class.
To determine exactly which studying techniques produce different average exam scores, the teacher would need to run post-hoc tests.
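A common starting point for such comparisons is the covariate-adjusted mean of each group: the exam score each technique would be expected to produce if every group had the same average current grade. The sketch below computes adjusted means and their pairwise differences for invented data. It is not the teacher's dataset, and it stops short of a formal post-hoc test, which would also need standard errors and a multiple-comparison correction such as Tukey or Bonferroni.

```python
from itertools import combinations
from statistics import mean

# Hypothetical (current grade, exam score) pairs, 5 students per technique.
data = {
    "A": [(70, 75), (80, 84), (75, 78), (85, 90), (90, 93)],
    "B": [(72, 85), (78, 88), (88, 97), (82, 94), (80, 91)],
    "C": [(68, 70), (76, 75), (84, 85), (79, 80), (88, 90)],
}

grand_grade_mean = mean([x for pairs in data.values() for x, _ in pairs])

# Pooled within-group slope of exam score on current grade.
sxy = sxx = 0.0
group_means = {}
for name, pairs in data.items():
    mx = mean([x for x, _ in pairs])
    my = mean([y for _, y in pairs])
    group_means[name] = (mx, my)
    sxy += sum((x - mx) * (y - my) for x, y in pairs)
    sxx += sum((x - mx) ** 2 for x, _ in pairs)
slope = sxy / sxx

# Adjusted mean: shift each group mean to the overall average grade.
adjusted = {
    name: my - slope * (mx - grand_grade_mean)
    for name, (mx, my) in group_means.items()
}

for name, value in adjusted.items():
    print(f"{name}: adjusted mean = {value:.2f}")

# Pairwise differences between adjusted means.
differences = {
    (a, b): adjusted[a] - adjusted[b] for a, b in combinations(adjusted, 2)
}
for (a, b), diff in differences.items():
    print(f"{a} - {b}: {diff:.2f}")
```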