**Introduction**

A **one-way analysis of variance** (ANOVA) is used to determine whether or not there is a statistically significant difference between the means of three or more independent groups.

Here are a couple of examples of when you might conduct a one-way ANOVA:

**Example 1:** You randomly split up a class of 90 students into three groups of 30. Each group uses a different studying technique for one month to prepare for an exam. At the end of the month, all of the students take the same exam. You want to know whether or not the studying technique has an impact on exam scores, so you conduct a one-way ANOVA to determine if there is a statistically significant difference between the mean scores of the three groups.

**Example 2:** You want to know whether or not sunlight impacts the growth of a certain plant, so you plant groups of seeds in four different locations that experience either high sunlight, medium sunlight, low sunlight or no sunlight. After one month you measure the height of each group of plants. To determine if sunlight impacts growth, you conduct a one-way ANOVA to determine if there is a statistically significant difference between the mean height of the four groups.

This type of test is called a *one-way* ANOVA because we are only analyzing how *one* factor impacts our response variable.

In **Example 1**, the one factor we are analyzing is “studying technique” to see how it impacts exam scores. If we instead analyzed “studying technique” and another factor like “gender”, then we would use a *two-way* ANOVA to see how these two factors impact exam scores.

In **Example 2**, the one factor we are analyzing is “sunlight amount” to see how it impacts plant growth. If we instead analyzed “sunlight amount”, “plant species”, and “water amount”, then we would use a *three-way* ANOVA to see how these three factors impact plant growth.

**One-Way ANOVA Assumptions**

Before we can conduct a one-way ANOVA, we need to make sure the following assumptions are met:

**1. Normality** – all populations that we’re studying follow a normal distribution. So, for example, if we want to compare the exam scores of three different groups of students, the exam scores for the first group, second group, and third group all need to be normally distributed.

**2. Equal Variance** – the population variances in each group are equal or approximately equal.

**3. Independence** – the observations in each group need to be independent of each other. Usually a randomized design will take care of this.

If these assumptions are met, then we can proceed with conducting a one-way ANOVA.
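In practice, the normality and equal-variance assumptions are often checked with formal tests before running the ANOVA. Here is a sketch of those checks in Python using the exam-score data from the example below, assuming SciPy is installed (the Shapiro-Wilk test for normality and Levene's test for equal variances are common choices, though visual checks like Q-Q plots work too):

```python
# Sketch: checking one-way ANOVA assumptions with SciPy.
from scipy import stats

# Exam scores for the three studying-technique groups
music = [78, 84, 86, 83, 90]
white_noise = [94, 88, 92, 85, 86]
no_noise = [91, 88, 86, 85, 80]

# 1. Normality: Shapiro-Wilk test per group (p > .05 -> no evidence against normality)
for name, group in [("music", music), ("white noise", white_noise), ("no noise", no_noise)]:
    w, p = stats.shapiro(group)
    print(f"{name}: W = {w:.3f}, p = {p:.3f}")

# 2. Equal variance: Levene's test across all groups (p > .05 -> variances look similar)
stat, p = stats.levene(music, white_noise, no_noise)
print(f"Levene: statistic = {stat:.3f}, p = {p:.3f}")
```

If both tests return p-values above the significance level, we have no evidence that the assumptions are violated and can proceed.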

**Example of a One-Way ANOVA**

Jessica randomly splits up her class of 15 students into three groups of 5. Each group uses a different studying technique for one month – studying while listening to music, studying while listening to white noise, and studying with no noise. At the end of the month, all of the students take the same exam. The following table shows the exam scores for the three groups:

Music | White Noise | No Noise |
---|---|---|
78 | 94 | 91 |
84 | 88 | 88 |
86 | 92 | 86 |
83 | 85 | 85 |
90 | 86 | 80 |

Jessica wants to know whether or not studying technique has an impact on exam scores, so she conducts a one-way ANOVA using a .05 significance level to determine if there is a statistically significant difference between the mean scores of the three groups.

Here we will demonstrate how to conduct a one-way ANOVA by hand and also by using a one-way ANOVA calculator.

**One-Way ANOVA by Hand**

To conduct a one-way ANOVA by hand, we follow the standard five steps for any hypothesis test:

**Step 1. State the hypotheses.**

The null hypothesis (H₀): µ₁ = µ₂ = µ₃ (the means are equal for each group)

The alternative hypothesis (Hₐ): at least one of the means is different from the others

**Step 2. Determine a significance level to use.**

The problem tells us that Jessica chose to use a .05 significance level.

**Step 3. Make an ANOVA table and find the F statistic and corresponding p-value.**

To make an ANOVA table, we need to find the following four numbers for each group:

- **Σx** - the sum of all values
- **(Σx)²** - the sum of all values, squared
- **Σx²** - the sum of each squared value
- **x̄** - the mean

Here is how to find these numbers for each group:

Music | White Noise | No Noise |
---|---|---|
78 | 94 | 91 |
84 | 88 | 88 |
86 | 92 | 86 |
83 | 85 | 85 |
90 | 86 | 80 |
Σx₁ = 78+84+86+83+90 = 421 | Σx₂ = 94+88+92+85+86 = 445 | Σx₃ = 91+88+86+85+80 = 430 |
(Σx₁)² = (421)² = 177241 | (Σx₂)² = (445)² = 198025 | (Σx₃)² = (430)² = 184900 |
Σx₁² = 78²+84²+86²+83²+90² = 35525 | Σx₂² = 94²+88²+92²+85²+86² = 39665 | Σx₃² = 91²+88²+86²+85²+80² = 37046 |
x̄₁ = (78+84+86+83+90) / 5 = 84.2 | x̄₂ = (94+88+92+85+86) / 5 = 89 | x̄₃ = (91+88+86+85+80) / 5 = 86 |
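These per-group numbers are straightforward to compute programmatically. Here is a minimal sketch in pure Python (no libraries assumed):

```python
# Sketch: the four per-group numbers used to build the ANOVA table.
groups = {
    "music": [78, 84, 86, 83, 90],
    "white_noise": [94, 88, 92, 85, 86],
    "no_noise": [91, 88, 86, 85, 80],
}

for name, scores in groups.items():
    sum_x = sum(scores)                     # Σx   - sum of all values
    sum_x_sq = sum_x ** 2                   # (Σx)² - square of the sum
    sum_sq_x = sum(x ** 2 for x in scores)  # Σx²  - sum of each squared value
    mean_x = sum_x / len(scores)            # x̄    - group mean
    print(f"{name}: Σx={sum_x}, (Σx)²={sum_x_sq}, Σx²={sum_sq_x}, x̄={mean_x}")
```

Running this reproduces the row of sums, squared sums, and means shown in the table above.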

Next, we need to find the total sum of squares, treatment sum of squares, and error sum of squares:

SS_{total} = (Σx₁² + Σx₂² + Σx₃²) – [(Σx₁ + Σx₂ + Σx₃)² / n] = (35525 + 39665 + 37046) – [(421 + 445 + 430)² / 15] = 261.6

SS_{treatment} = [(Σx₁)²/n₁ + (Σx₂)²/n₂ + (Σx₃)²/n₃] – [(Σx₁ + Σx₂ + Σx₃)² / n] = [177241/5 + 198025/5 + 184900/5] – [(421 + 445 + 430)² / 15] = 58.8

SS_{error} = SS_{total} – SS_{treatment} = 261.6 – 58.8 = 202.8

Once we have these numbers, we simply need to fill in the ANOVA table, which looks like this:

Source | SS | df | MS | F | P |
---|---|---|---|---|---|
Treatment | SS_{treatment} | # groups – 1 | SS_{treatment} / df_{treatment} | MS_{treatment} / MS_{error} | p-value for F(df_{treatment}, df_{error}) |
Error | SS_{error} | (n – 1) – (# groups – 1) | SS_{error} / df_{error} | | |
Total | SS_{total} | n – 1 | | | |

So, filling in our numbers we get:

Source | SS | df | MS | F | P |
---|---|---|---|---|---|
Treatment | 58.8 | 3 – 1 = 2 | 58.8 / 2 = 29.4 | 29.4 / 16.9 = 1.74 | 0.217 |
Error | 202.8 | (14) – (2) = 12 | 202.8 / 12 = 16.9 | | |
Total | 261.6 | 15 – 1 = 14 | | | |

*Note: We found the p-value using an F distribution calculator with numerator degrees of freedom = 2, denominator degrees of freedom = 12, and F-value = 1.74. The calculator returns the cumulative probability .783, and 1 – .783 = .217.*
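Instead of an online F distribution calculator, the same p-value can be obtained programmatically. A minimal sketch, assuming SciPy is available (the survival function `sf` gives P(F > f) directly, so no subtraction from 1 is needed):

```python
# Sketch: p-value for the F statistic with df = (2, 12).
from scipy import stats

f_value = 29.4 / 16.9          # MS_treatment / MS_error ≈ 1.74
df_treatment, df_error = 2, 12

# Survival function = 1 - CDF, i.e. P(F > f_value) under the null hypothesis
p_value = stats.f.sf(f_value, df_treatment, df_error)
print(round(p_value, 3))
# → 0.217
```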

**Step 4. Reject or fail to reject the null hypothesis.**

Since the p-value is greater than our significance level of .05, we fail to reject the null hypothesis.

**Step 5. Interpret the results.**

Since we failed to reject the null hypothesis, we do not have sufficient evidence to say that study technique has an impact on exam scores.

**One-Way ANOVA by Calculator**

Instead of calculating all of these numbers by hand, we could just use a One-Way ANOVA calculator. Using this calculator, we can simply enter the exam scores for each group and then hit the “Calculate” button:

The calculator reports the same F statistic and p-value that we calculated by hand.
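If you would rather work in code than in a web calculator, the entire test can be run in one call. A sketch using SciPy's `f_oneway` function:

```python
# Sketch: one-way ANOVA in a single call, assuming scipy is installed.
from scipy import stats

music = [78, 84, 86, 83, 90]
white_noise = [94, 88, 92, 85, 86]
no_noise = [91, 88, 86, 85, 80]

f_stat, p_value = stats.f_oneway(music, white_noise, no_noise)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")
# → F = 1.74, p = 0.217
```

These match the F statistic and p-value from the by-hand ANOVA table.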

**The Non-technical Way to Interpret a One-Way ANOVA**

Here is how to think about a one-way ANOVA in a non-technical way:

In the example above we had the exam scores for three groups of students. If we just graphed the scores of those three groups, here’s what it would look like: (each dot represents a student)

We can visually see that the scores for the students who studied with white noise tend to be slightly higher, but the reason that we conduct a one-way ANOVA is to determine if this difference in scores is due to random variation (i.e. smarter students just randomly landed in the white noise group) or due to some other factor (i.e. the studying technique).

When we conduct a one-way ANOVA, the MS_{treatment} tells us the “average variation between groups” and the MS_{error} tells us the “average variation within groups.” Then, the F-value is the ratio of MS_{treatment} to MS_{error}.

If this F-value is high, it indicates that the variation *between* the groups is noticeably higher than the variation *within* the groups, which suggests that the difference in exam scores between the groups is more likely due to the studying technique than to random variation.

And to determine whether or not an F-value is “high” we simply look at the corresponding p-value. In our example, we found that the p-value was 0.217, which was higher than our significance level of 0.05.

This tells us that even though there were differences in the exam scores between the three groups of students, the differences between the groups weren’t large enough to be considered statistically significant, so we weren’t able to conclude that studying technique was the reason for the difference in exam scores.