How to Calculate Sum of Squares in ANOVA (With Example)


In statistics, a one-way ANOVA is used to compare the means of three or more independent groups to determine if there is a statistically significant difference between the corresponding population means.

Whenever you perform a one-way ANOVA, you will always compute three sum of squares values:

1. Sum of Squares Regression (SSR)

  • This is the sum of the squared differences between each group mean and the grand mean.

2. Sum of Squares Error (SSE)

  • This is the sum of the squared differences between each individual observation and the group mean of that observation.

3. Sum of Squares Total (SST)

  • This is the sum of the squared differences between each individual observation and the grand mean.

Each of these three values is placed in the final ANOVA table, which we use to determine whether or not there is a statistically significant difference between the group means.

The following example shows how to calculate each of these sum of squares values for a one-way ANOVA in practice.

Example: How to Calculate Sum of Squares in ANOVA

Suppose we want to know whether or not three different exam prep programs lead to different mean scores on a certain exam. To test this, we recruit 30 students to participate in a study and split them into three groups.

The students in each group are randomly assigned to use one of the three exam prep programs for the next three weeks to prepare for an exam. At the end of the three weeks, all of the students take the same exam. 

The exam scores for each group are shown below:

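Group 1: 85, 86, 88, 75, 78, 94, 98, 79, 71, 80

Group 2: 91, 92, 93, 85, 87, 84, 82, 88, 95, 96

Group 3: 79, 78, 88, 94, 92, 85, 83, 85, 82, 81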

The following steps show how to calculate the sum of squares values for this one-way ANOVA.

Step 1: Calculate the group means and the grand mean.

First, we will calculate the mean for all three groups along with the grand (or “overall”) mean:

  • Group 1 mean: 83.4
  • Group 2 mean: 89.3
  • Group 3 mean: 84.7
  • Grand mean: 85.8
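As a quick check on this step, here is a minimal Python sketch that computes the group means and the grand mean. The scores are the ones that appear in the per-observation terms of Step 3; the variable names (`scores`, `group_means`, `grand_mean`) are illustrative choices, not part of the original tutorial.

```python
# Exam scores for the three prep programs, as listed in the Step 3 terms.
scores = {
    "Group 1": [85, 86, 88, 75, 78, 94, 98, 79, 71, 80],
    "Group 2": [91, 92, 93, 85, 87, 84, 82, 88, 95, 96],
    "Group 3": [79, 78, 88, 94, 92, 85, 83, 85, 82, 81],
}

# Mean of each group.
group_means = {name: sum(vals) / len(vals) for name, vals in scores.items()}

# Grand mean: the mean of all 30 observations pooled together.
all_scores = [x for vals in scores.values() for x in vals]
grand_mean = sum(all_scores) / len(all_scores)

for name, mean in group_means.items():
    print(f"{name} mean: {mean:.1f}")
print(f"Grand mean: {grand_mean:.1f}")
```

Because all three groups have the same size here, the grand mean also equals the average of the three group means; with unequal group sizes, only the pooled calculation above is correct.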

Step 2: Calculate SSR.

Next, we will calculate the sum of squares regression (SSR) using the following formula:

$$\text{SSR} = \sum_{j} n_j \left(\bar{X}_j - \bar{X}_{..}\right)^2$$

where:

  • $n_j$: the sample size of group j
  • Σ: a Greek symbol that means “sum”
  • $\bar{X}_j$: the mean of group j
  • $\bar{X}_{..}$: the overall mean

In our example, we calculate that SSR = 10(83.4-85.8)^2 + 10(89.3-85.8)^2 + 10(84.7-85.8)^2 = 192.2
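The same calculation can be sketched in Python. This is a minimal illustration, assuming the scores read off the Step 3 terms; it weights each group's squared deviation by that group's own size, so it would also work if the groups were not all of size 10.

```python
# Exam scores for the three prep programs, as listed in the Step 3 terms.
scores = {
    "Group 1": [85, 86, 88, 75, 78, 94, 98, 79, 71, 80],
    "Group 2": [91, 92, 93, 85, 87, 84, 82, 88, 95, 96],
    "Group 3": [79, 78, 88, 94, 92, 85, 83, 85, 82, 81],
}

all_scores = [x for vals in scores.values() for x in vals]
grand_mean = sum(all_scores) / len(all_scores)

# SSR: for each group, (group size) * (group mean - grand mean)^2, then sum.
ssr = sum(
    len(vals) * (sum(vals) / len(vals) - grand_mean) ** 2
    for vals in scores.values()
)

print(f"SSR = {ssr:.1f}")
```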

Step 3: Calculate SSE.

Next, we will calculate the sum of squares error (SSE) using the following formula:

$$\text{SSE} = \sum_{j} \sum_{i} \left(X_{ij} - \bar{X}_j\right)^2$$

where:

  • Σ: a Greek symbol that means “sum”
  • $X_{ij}$: the ith observation in group j
  • $\bar{X}_j$: the mean of group j

In our example, we calculate SSE as follows:

Group 1: (85-83.4)^2 + (86-83.4)^2 + (88-83.4)^2 + (75-83.4)^2 + (78-83.4)^2 + (94-83.4)^2 + (98-83.4)^2 + (79-83.4)^2 + (71-83.4)^2 + (80-83.4)^2 = 640.4

Group 2: (91-89.3)^2 + (92-89.3)^2 + (93-89.3)^2 + (85-89.3)^2 + (87-89.3)^2 + (84-89.3)^2 + (82-89.3)^2 + (88-89.3)^2 + (95-89.3)^2 + (96-89.3)^2 = 208.1

Group 3: (79-84.7)^2 + (78-84.7)^2 + (88-84.7)^2 + (94-84.7)^2 + (92-84.7)^2 + (85-84.7)^2 + (83-84.7)^2 + (85-84.7)^2 + (82-84.7)^2 + (81-84.7)^2 = 252.1

SSE: 640.4 + 208.1 + 252.1 = 1100.6
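A short Python sketch of the same step, assuming the scores shown in the terms above. The helper name `within_group_ss` is a hypothetical name chosen for this illustration.

```python
# Exam scores for the three prep programs, as listed in the terms above.
scores = {
    "Group 1": [85, 86, 88, 75, 78, 94, 98, 79, 71, 80],
    "Group 2": [91, 92, 93, 85, 87, 84, 82, 88, 95, 96],
    "Group 3": [79, 78, 88, 94, 92, 85, 83, 85, 82, 81],
}


def within_group_ss(values):
    """Sum of squared deviations of each observation from its own group mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


# One within-group sum of squares per group, then the total across groups.
sse_by_group = {name: within_group_ss(vals) for name, vals in scores.items()}
sse = sum(sse_by_group.values())

for name, ss in sse_by_group.items():
    print(f"{name}: {ss:.1f}")
print(f"SSE = {sse:.1f}")
```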

Step 4: Calculate SST.

Next, we will calculate the sum of squares total (SST) using the following formula:

SST = SSR + SSE

In our example, SST = 192.2 + 1100.6 = 1292.8
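SST can also be computed directly from its definition, as the sum of squared differences between each observation and the grand mean. The sketch below, which assumes the scores from the Step 3 terms, computes SST both ways and checks that the two results agree up to floating-point rounding.

```python
import math

# Exam scores for the three prep programs, as listed in the Step 3 terms.
scores = {
    "Group 1": [85, 86, 88, 75, 78, 94, 98, 79, 71, 80],
    "Group 2": [91, 92, 93, 85, 87, 84, 82, 88, 95, 96],
    "Group 3": [79, 78, 88, 94, 92, 85, 83, 85, 82, 81],
}

all_scores = [x for vals in scores.values() for x in vals]
grand_mean = sum(all_scores) / len(all_scores)

# SST from the definition: every observation against the grand mean.
sst_direct = sum((x - grand_mean) ** 2 for x in all_scores)

# SSR and SSE, computed as in Steps 2 and 3.
ssr = sum(
    len(vals) * (sum(vals) / len(vals) - grand_mean) ** 2
    for vals in scores.values()
)
sse = sum(
    (x - sum(vals) / len(vals)) ** 2
    for vals in scores.values()
    for x in vals
)

# The two routes to SST should match up to rounding error.
decomposition_holds = math.isclose(sst_direct, ssr + sse)

print(f"SST (direct)    = {sst_direct:.1f}")
print(f"SST (SSR + SSE) = {ssr + sse:.1f}")
print(f"Match: {decomposition_holds}")
```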

Once we have calculated the values for SSR, SSE, and SST, each of these values will eventually be placed in the ANOVA table:

| Source | Sum of Squares (SS) | df | Mean Squares (MS) | F-value | p-value |
|---|---|---|---|---|---|
| Regression | 192.2 | 2 | 96.1 | 2.358 | 0.1138 |
| Error | 1100.6 | 27 | 40.8 | | |
| Total | 1292.8 | 29 | | | |

Here is how we calculated the various numbers in the table:

  • df regression: k-1 = 3-1 = 2
  • df error: n-k = 30-3 = 27
  • df total: n-1 = 30-1 = 29
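  • MS regression: SSR / df regression = 192.2 / 2 = 96.1
  • MS error: SSE / df error = 1100.6 / 27 = 40.8
  • F-value: MS regression / MS error = 96.1 / 40.8 = 2.358 (computed with the unrounded MS error of 40.763)
  • p-value: the p-value that corresponds to an F-value of 2.358 with 2 and 27 degrees of freedom, which is 0.1138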

Note: n = total observations, k = number of groups
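The table entries can be reproduced with a few lines of Python. This is a sketch under one stated assumption: the p-value uses a closed-form expression for the upper tail of the F distribution that is valid only when the numerator degrees of freedom equal 2 (that is, exactly three groups, as in this example). For other designs, a library routine such as `scipy.stats.f.sf(f_value, df_regression, df_error)` would be the usual choice.

```python
# Sums of squares from Steps 2 and 3.
ssr = 192.2
sse = 1100.6

n = 30  # total observations
k = 3   # number of groups

# Degrees of freedom.
df_regression = k - 1
df_error = n - k
df_total = n - 1

# Mean squares and the F statistic.
ms_regression = ssr / df_regression
ms_error = sse / df_error
f_value = ms_regression / ms_error

# Upper-tail probability of F(2, df_error).
# ASSUMPTION: this closed form holds only when df_regression == 2.
p_value = (1 + 2 * f_value / df_error) ** (-df_error / 2)

print(f"df: {df_regression}, {df_error}, {df_total}")
print(f"MS regression = {ms_regression:.1f}")
print(f"MS error      = {ms_error:.1f}")
print(f"F-value       = {f_value:.3f}")
print(f"p-value       = {p_value:.4f}")
```

Note that the F-value is computed from the unrounded mean squares, which is why it comes out as 2.358 rather than the slightly smaller value obtained by dividing the rounded table entries.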

Check out this tutorial for how to interpret the F-Value and p-value in the ANOVA table.
