A one-way ANOVA is used to determine whether or not there is a statistically significant difference between the means of three or more independent groups.
If the overall p-value from the ANOVA table is less than some significance level, then we have sufficient evidence to say that at least one of the means of the groups is different from the others.
To find out exactly which group means are different, we must conduct a post hoc test.
You can use the LSMEANS statement in SAS to perform a variety of post-hoc tests.
The following example shows how to use the LSMEANS statement in practice.
Example: How to Use the LSMEANS Statement in SAS
Suppose a researcher recruits 24 students to participate in a study. The students are randomly assigned to use one of three studying methods (eight students per method) to prepare for an exam.
The exam results for each student are shown below:
We can use the following code to create this dataset in SAS:
/*create dataset*/
data my_data;
    input Method $ Score;
    datalines;
A 78
A 81
A 82
A 82
A 85
A 88
A 88
A 90
B 81
B 83
B 83
B 85
B 86
B 88
B 90
B 91
C 84
C 88
C 88
C 89
C 90
C 93
C 95
C 98
;
run;
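Before fitting the model, it can help to check the group sizes and means directly. The following is a minimal sketch using PROC MEANS on the my_data dataset created above; the choice of statistics and the MAXDEC= setting are illustrative, not part of the original example:

```sas
/*summarize exam scores by studying method (illustrative check)*/
proc means data=my_data n mean std maxdec=3;
    class Method;   /*one summary row per studying method*/
    var Score;      /*variable to summarize*/
run;
```

With the data above, this should report 8 observations per method and mean scores of 84.25 for method A, 85.875 for method B, and 90.625 for method C.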
Next, we’ll use proc ANOVA to perform the one-way ANOVA:
/*perform one-way ANOVA*/
proc ANOVA data=my_data;
    class Method;
    model Score = Method;
run;
This produces the following ANOVA table:
From this table we can see:
- The overall F Value: 5.26
- The corresponding p-value: 0.0140
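As a check on the F value, the statistic can be recomputed by hand from the dataset. The group means and sums of squares below are calculated from the 24 scores (they are not copied from the SAS output), with k = 3 groups and N = 24 observations:

```latex
\begin{aligned}
\bar{x}_A &= 84.25, \quad \bar{x}_B = 85.875, \quad \bar{x}_C = 90.625, \quad \bar{x} \approx 86.9167 \\
SS_{\text{between}} &= 8\left[(84.25 - 86.9167)^2 + (85.875 - 86.9167)^2 + (90.625 - 86.9167)^2\right] \approx 175.58 \\
SS_{\text{within}} &= 121.5 + 88.875 + 139.875 = 350.25 \\
F &= \frac{SS_{\text{between}} / (k - 1)}{SS_{\text{within}} / (N - k)}
   = \frac{175.58 / 2}{350.25 / 21}
   \approx \frac{87.79}{16.68} \approx 5.26
\end{aligned}
```

The hand calculation agrees with the F value of 5.26 reported above.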
Recall that a one-way ANOVA uses the following null and alternative hypotheses:
- H0: All group means are equal.
- HA: At least one group mean is different from the rest.
Since the p-value from the ANOVA table (0.0140) is less than α = .05, we reject the null hypothesis.
This tells us that the mean exam score is not the same across all three studying methods; at least one method has a different mean.
To determine exactly which group means are different, we can use PROC GLIMMIX with the LSMEANS statement and the ADJUST=TUKEY option to perform Tukey’s post hoc comparisons:
/*perform Tukey post-hoc comparisons*/
proc glimmix data=my_data;
    class Method;
    model Score = Method;
    lsmeans Method / adjust=tukey alpha=.05;
run;
The last table in the output shows the results of the Tukey post-hoc comparisons:
We can look at the Adj P column to view the adjusted p-values for the difference in group means.
From this column we can see that there is only one row with an adjusted p-value less than .05: the row that compares the mean difference between group A and group C.
This tells us there is a statistically significant difference in mean exam scores between group A and group C.
Specifically, we can see:
- The difference in mean exam scores between group A and group C (A minus C) was -6.375, i.e., students in group A scored 6.375 points lower on average than students in group C.
- The adjusted p-value for the difference in means is 0.0137.
- The adjusted 95% confidence interval for the true difference in mean exam scores between these two groups is [-11.5219, -1.2281].
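The reported interval can be checked against the usual form of a Tukey interval for a pairwise difference. The following is a sketch, assuming equal group sizes n = 8 and an error mean square MSE = 350.25/21 ≈ 16.6786 computed by hand from the data (neither value is quoted from the SAS output). The studentized range critical value q for 3 groups and 21 error degrees of freedom is left symbolic:

```latex
\begin{aligned}
\text{CI} &= (\bar{x}_A - \bar{x}_C) \pm \frac{q_{0.05;\,3,\,21}}{\sqrt{2}} \cdot SE,
  \qquad SE = \sqrt{\frac{2\,MSE}{n}} = \sqrt{\frac{2(16.6786)}{8}} \approx 2.042 \\
\text{center} &= \frac{-11.5219 + (-1.2281)}{2} = -6.375 \\
\text{half-width} &= \frac{-1.2281 - (-11.5219)}{2} = 5.1469
\end{aligned}
```

The center of the interval matches the estimated difference of -6.375, and the half-width divided by the standard error gives an implied multiplier of roughly 2.52 for the Tukey adjustment.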
There are no statistically significant differences between any other group means.
Note: In this example we used ADJUST=TUKEY to perform Tukey post hoc comparisons, but you can also specify BON, DUNNETT, NELSON, SCHEFFE, SIDAK, or SMM to perform other types of post hoc comparisons.
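As an illustration of switching the adjustment method, the sketch below requests Bonferroni-adjusted pairwise comparisons and, separately, Dunnett comparisons against a control group. Treating method A as the control is an assumption made only for this example; the original study does not designate a control group.

```sas
/*illustrative sketch: other multiplicity adjustments for the same model*/
proc glimmix data=my_data;
    class Method;
    model Score = Method;
    /*Bonferroni-adjusted comparisons of all pairs of methods*/
    lsmeans Method / adjust=bon alpha=.05;
    /*Dunnett comparisons of methods B and C against method A (assumed control)*/
    lsmeans Method / diff=control('A') adjust=dunnett alpha=.05;
run;
```

Each LSMEANS statement produces its own table of differences, so the adjusted p-values from the two methods can be compared side by side.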