Within-Group vs. Between Group Variation in ANOVA


A one-way ANOVA is used to determine whether or not the means of three or more independent groups are equal.

A one-way ANOVA uses the following null and alternative hypotheses:

  • H0: All group means are equal.
  • HA: At least one group mean is different from the rest.

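Whenever you perform a one-way ANOVA, you will end up with a summary table that reports the sum of squares, degrees of freedom, and mean square for the between group and within-group sources of variation, along with the F-statistic and its p-value.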

We can see that there are two different sources of variation that an ANOVA measures:

Between Group Variation: The total variation between each group mean and the overall mean, with each group weighted by its sample size.

Within-Group Variation: The total variation between the individual values in each group and their group mean.

If the Between group variation is high relative to the Within-group variation, then the F-statistic of the ANOVA will be higher and the corresponding p-value will be lower, which makes it more likely that we’ll reject the null hypothesis that the group means are equal.

The following example shows how to calculate the Between group variation and Within-group variation for a one-way ANOVA in practice.

Example: Calculating Within-Group and Between Group Variation in ANOVA

Suppose we want to determine if three different studying methods lead to different mean exam scores. To test this, we recruit 30 students and randomly assign 10 students to each studying method.

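The exam scores for the students in each group are shown below:

Group 1: 75, 77, 78, 78, 79, 81, 81, 83, 86, 87
Group 2: 78, 78, 79, 81, 81, 82, 83, 85, 86, 88
Group 3: 82, 82, 84, 86, 86, 87, 87, 89, 90, 94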

We can use the following formula to calculate the between group variation:

Between Group Variation = $\sum_j n_j (\bar{X}_j - \bar{X}_{..})^2$

where:

  • $n_j$: the sample size of group j
  • $\sum$: a symbol that means “sum”
  • $\bar{X}_j$: the mean of group j
  • $\bar{X}_{..}$: the overall mean

To calculate this value, we’ll first calculate each group mean and the overall mean:

  • Group 1 mean: 80.5
  • Group 2 mean: 82.1
  • Group 3 mean: 86.7
  • Overall mean: 83.1

Then we calculate the between group variation to be: $10(80.5-83.1)^2 + 10(82.1-83.1)^2 + 10(86.7-83.1)^2 = 207.2$.
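The same calculation can be sketched in a few lines of Python. This is only an illustrative sketch: the variable names (`groups`, `overall_mean`, `between`) and the helper `mean` are assumptions made for this example, and the scores are the ones used in the hand calculations in this tutorial.

```python
# Exam scores for each studying method (values from the worked example).
groups = {
    "Group 1": [75, 77, 78, 78, 79, 81, 81, 83, 86, 87],
    "Group 2": [78, 78, 79, 81, 81, 82, 83, 85, 86, 88],
    "Group 3": [82, 82, 84, 86, 86, 87, 87, 89, 90, 94],
}


def mean(values):
    """Arithmetic mean of a list of numbers (hypothetical helper)."""
    return sum(values) / len(values)


# Overall mean across all 30 observations.
all_scores = [x for scores in groups.values() for x in scores]
overall_mean = mean(all_scores)

# Between group variation: n_j * (group mean - overall mean)^2, summed over groups.
between = sum(
    len(scores) * (mean(scores) - overall_mean) ** 2
    for scores in groups.values()
)

print(round(overall_mean, 1))  # 83.1
print(round(between, 1))       # 207.2
```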

Next, we can use the following formula to calculate the within group variation:

Within Group Variation = $\sum_{i,j} (X_{ij} - \bar{X}_j)^2$

where:

  • $\sum$: a symbol that means “sum”
  • $X_{ij}$: the ith observation in group j
  • $\bar{X}_j$: the mean of group j

In our example, we calculate within group variation to be:

Group 1: $(75-80.5)^2 + (77-80.5)^2 + (78-80.5)^2 + (78-80.5)^2 + (79-80.5)^2 + (81-80.5)^2 + (81-80.5)^2 + (83-80.5)^2 + (86-80.5)^2 + (87-80.5)^2 = 136.5$

Group 2: $(78-82.1)^2 + (78-82.1)^2 + (79-82.1)^2 + (81-82.1)^2 + (81-82.1)^2 + (82-82.1)^2 + (83-82.1)^2 + (85-82.1)^2 + (86-82.1)^2 + (88-82.1)^2 = 104.9$

Group 3: $(82-86.7)^2 + (82-86.7)^2 + (84-86.7)^2 + (86-86.7)^2 + (86-86.7)^2 + (87-86.7)^2 + (87-86.7)^2 + (89-86.7)^2 + (90-86.7)^2 + (94-86.7)^2 = 122.1$

Within Group Variation: 136.5 + 104.9 + 122.1 = 363.5
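As a rough check on the arithmetic, the per-group sums of squares can be reproduced in Python. This is a minimal sketch; the names `groups`, `mean`, `within_by_group`, and `within` are hypothetical and chosen only for this illustration.

```python
# Exam scores for each studying method (values from the worked example).
groups = {
    "Group 1": [75, 77, 78, 78, 79, 81, 81, 83, 86, 87],
    "Group 2": [78, 78, 79, 81, 81, 82, 83, 85, 86, 88],
    "Group 3": [82, 82, 84, 86, 86, 87, 87, 89, 90, 94],
}


def mean(values):
    """Arithmetic mean of a list of numbers (hypothetical helper)."""
    return sum(values) / len(values)


# For each group, sum the squared deviations of its scores from its own mean.
within_by_group = {
    name: sum((x - mean(scores)) ** 2 for x in scores)
    for name, scores in groups.items()
}

# Within group variation is the total across all groups.
within = sum(within_by_group.values())

for name, value in within_by_group.items():
    print(name, round(value, 1))  # 136.5, 104.9, 122.1
print(round(within, 1))           # 363.5
```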

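If we use statistical software to perform a one-way ANOVA using this dataset, we’ll end up with the following ANOVA table:

Source            SS       df    MS        F         p-value
Between Groups    207.2    2     103.6     7.6952    .0023
Within Groups     363.5    27    13.463
Total             570.7    29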

Notice that the between group and within-group variation values match the ones we calculated by hand.

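The overall F-statistic in the table is the ratio of the between group mean square to the within group mean square, where each mean square is the corresponding variation divided by its degrees of freedom (3 − 1 = 2 between groups and 30 − 3 = 27 within groups).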

The larger the F-statistic, the greater the variation between group means relative to the variation within the groups.

Thus, the larger the F-statistic, the greater the evidence that there is a difference between the group means.

We can see in this example that the p-value that corresponds to an F-statistic of 7.6952 is .0023.

Since this value is less than α = .05, we reject the null hypothesis of the ANOVA and conclude that the three studying techniques do not all lead to the same mean exam score.
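The F-statistic and p-value can also be reproduced from the two sums of squares. The sketch below uses only the Python standard library and rests on one assumption that should be stated loudly: the closed-form p-value expression used here is valid only when the numerator degrees of freedom equal 2 (that is, exactly three groups). For other designs, a statistics library such as SciPy (`scipy.stats.f.sf` or `scipy.stats.f_oneway`) would be the more general choice. All variable names are illustrative.

```python
# Sums of squares from the hand calculations in this example.
ss_between = 207.2
ss_within = 363.5

k = 3    # number of groups
n = 30   # total number of observations

df_between = k - 1   # 2
df_within = n - k    # 27

# Mean squares: each sum of squares divided by its degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F-statistic: ratio of the two mean squares.
f_stat = ms_between / ms_within

# ASSUMPTION: this closed form for P(F > f_stat) holds only when
# df_between == 2; it is not a general F-distribution survival function.
assert df_between == 2
p_value = (1 + 2 * f_stat / df_within) ** (-df_within / 2)

print(round(f_stat, 4))   # 7.6952
print(round(p_value, 4))  # 0.0023
```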

