A two sample t-test is used to test whether or not the means of two populations are equal.
This tutorial provides a complete guide on how to interpret the results of a two sample t-test in R.
Step 1: Create the Data
Suppose we want to know if two different species of plants have the same mean height. To test this, we collect a simple random sample of 12 plants from each species.
#create vector of plant heights from group 1
group1 <- c(8, 8, 9, 9, 9, 11, 12, 13, 13, 14, 15, 19)

#create vector of plant heights from group 2
group2 <- c(11, 12, 13, 13, 14, 14, 14, 15, 16, 18, 18, 19)
Step 2: Perform & Interpret the Two Sample t-test
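Next, we will use the t.test() function to perform a two sample t-test. By default, R performs Welch's version of the test, which does not assume that the two groups have equal variances: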
#perform two sample t-test
t.test(group1, group2)

	Welch Two Sample t-test

data:  group1 and group2
t = -2.5505, df = 20.488, p-value = 0.01884
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -5.6012568 -0.5654098
sample estimates:
mean of x mean of y 
 11.66667  14.75000 
Here’s how to interpret the results of the test:
data: This tells us the data that was used in the two sample t-test. In this case, we used the vectors called group1 and group2.
t: This is the t test-statistic. In this case, it is -2.5505.
df: This is the degrees of freedom associated with the t test-statistic. In this case, it’s 20.488. The value is not a whole number because Welch's test estimates it from the sample variances. Refer to the Satterthwaite approximation for an explanation of how this degrees of freedom value is calculated.
p-value: This is the two-sided p-value that corresponds to a t test-statistic of -2.5505 and df = 20.488. The p-value turns out to be 0.01884. We can confirm this value by using the T Score to P Value calculator.
alternative hypothesis: This tells us the alternative hypothesis used for this particular t-test. In this case, the alternative hypothesis is that the true difference in means between the two groups is not equal to zero.
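95 percent confidence interval: This tells us the 95% confidence interval for the true difference in means between the two groups. It turns out to be [-5.6013, -0.5654]. Because this interval does not contain zero, it is consistent with a statistically significant difference at the 0.05 level.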
sample estimates: This tells us the sample mean of each group. In this case, the sample mean of group 1 was 11.667 and the sample mean of group 2 was 14.75.
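The values described above can be reproduced by hand. The following is a minimal sketch, assuming the standard Welch formulas for the standard error and the Satterthwaite degrees of freedom; the variable names (v1, v2, se, t_stat, df_welch, p_value, ci) are introduced here for illustration only.

```r
#recreate the two samples from Step 1
group1 <- c(8, 8, 9, 9, 9, 11, 12, 13, 13, 14, 15, 19)
group2 <- c(11, 12, 13, 13, 14, 14, 14, 15, 16, 18, 18, 19)

#sample sizes
n1 <- length(group1)
n2 <- length(group2)

#variance of each sample mean (sample variance divided by sample size)
v1 <- var(group1) / n1
v2 <- var(group2) / n2

#difference in sample means and its standard error
mean_diff <- mean(group1) - mean(group2)
se <- sqrt(v1 + v2)

#t test-statistic
t_stat <- mean_diff / se

#Satterthwaite approximation for the degrees of freedom
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

#two-sided p-value
p_value <- 2 * pt(-abs(t_stat), df = df_welch)

#95% confidence interval for the difference in means
ci <- mean_diff + c(-1, 1) * qt(0.975, df = df_welch) * se

#compare with the values reported by t.test()
res <- t.test(group1, group2)
round(c(t = t_stat, df = df_welch, p = p_value), 5)
round(ci, 4)
```

Each hand-computed value should agree with the corresponding component of the t.test() result (res$statistic, res$parameter, res$p.value, and res$conf.int).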
The two hypotheses for this particular two sample t-test are as follows:
H0: µ1 = µ2 (the two population means are equal)
HA: µ1 ≠ µ2 (the two population means are not equal)
Because the p-value of our test (0.01884) is less than alpha = 0.05, we reject the null hypothesis of the test. This means we have sufficient evidence to say that the mean plant height differs between the two species.
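The same decision can be made in code by storing the test result and comparing its p-value to the significance level. This is a small sketch; the names result, alpha, and reject_null are chosen here for illustration.

```r
#recreate the two samples from Step 1
group1 <- c(8, 8, 9, 9, 9, 11, 12, 13, 13, 14, 15, 19)
group2 <- c(11, 12, 13, 13, 14, 14, 14, 15, 16, 18, 18, 19)

#store the test result so its components can be accessed
result <- t.test(group1, group2)

#significance level
alpha <- 0.05

#TRUE if the p-value is below alpha
reject_null <- result$p.value < alpha

if (reject_null) {
  message("Reject H0: the population means appear to differ.")
} else {
  message("Fail to reject H0: insufficient evidence of a difference.")
}
```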
The t.test() function in R uses the following syntax:
t.test(x, y, alternative = "two.sided", mu = 0, paired = FALSE, var.equal = FALSE, conf.level = 0.95)
- x, y: The names of the two vectors that contain the data.
- alternative: The alternative hypothesis. Options include "two.sided", "less", or "greater".
- mu: The difference in means assumed under the null hypothesis.
- paired: Whether or not to use a paired t-test.
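- var.equal: Whether or not to assume that the variances are equal between the two groups. If TRUE, a pooled variance is used; if FALSE, Welch's approximation is used.
- conf.level: The confidence level to use for the confidence interval.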
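As a quick illustration of this syntax, the sketch below spells out every argument at its default value and checks that the result matches the shorter call used earlier. The object names default_call and explicit_call are introduced here for illustration.

```r
#recreate the two samples from Step 1
group1 <- c(8, 8, 9, 9, 9, 11, 12, 13, 13, 14, 15, 19)
group2 <- c(11, 12, 13, 13, 14, 14, 14, 15, 16, 18, 18, 19)

#call relying on the default argument values
default_call <- t.test(group1, group2)

#same call with every argument written out explicitly
explicit_call <- t.test(group1, group2,
                        alternative = "two.sided",
                        mu = 0,
                        paired = FALSE,
                        var.equal = FALSE,
                        conf.level = 0.95)

#both calls should report the same test statistic and p-value
all.equal(default_call$statistic, explicit_call$statistic)
all.equal(default_call$p.value, explicit_call$p.value)
```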
In our example above, we used the following default settings:
- We used a two-sided alternative hypothesis.
- We tested whether or not the true difference in means was equal to zero.
- We used a two sample t-test, not a paired t-test.
- We didn’t make the assumption that the variances were equal between the groups.
- We used a 95% confidence level.
Feel free to change any of these arguments when you conduct your own t-test, depending on the particular test you want to perform.
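For example, the sketch below changes one argument at a time using the same plant data. The object names (pooled, one_sided, ci99) are chosen here for illustration, and whether each variation is appropriate depends on your own data and research question.

```r
#recreate the two samples from Step 1
group1 <- c(8, 8, 9, 9, 9, 11, 12, 13, 13, 14, 15, 19)
group2 <- c(11, 12, 13, 13, 14, 14, 14, 15, 16, 18, 18, 19)

#assume equal variances (pooled variance, df = n1 + n2 - 2 = 22)
pooled <- t.test(group1, group2, var.equal = TRUE)
pooled$parameter

#one-sided alternative: mean of group1 is less than mean of group2
one_sided <- t.test(group1, group2, alternative = "less")
one_sided$p.value

#99% confidence interval instead of 95%
ci99 <- t.test(group1, group2, conf.level = 0.99)
ci99$conf.int
```

Because the observed t test-statistic is negative, the one-sided p-value for alternative = "less" is half of the two-sided p-value, and the 99% interval is wider than the 95% interval.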