Welch’s t-test is used to compare the means between two independent groups when it is *not* assumed that the two groups have equal variances.

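To perform Welch’s t-test in R, we can use the **t.test()** function. Its **var.equal** argument defaults to FALSE, so it runs Welch’s version of the test unless told otherwise. The function uses the following syntax: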

**t.test(x, y, alternative = c("two.sided", "less", "greater"))**

where:

- **x:** A numeric vector of data values for the first group
- **y:** A numeric vector of data values for the second group
- **alternative:** The alternative hypothesis for the test. The default is "two.sided".
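As a minimal sketch of how the **alternative** argument is passed, the snippet below runs a two-sided and a one-sided test. The vectors `group_a` and `group_b` are hypothetical values made up for illustration and are not part of the example that follows.

```r
# Hypothetical data, for illustration only
group_a <- c(12.1, 14.3, 13.8, 15.2, 14.9)
group_b <- c(11.0, 12.4, 10.9, 13.1, 12.2)

# Two-sided test (the default): H1 is mean(group_a) != mean(group_b)
two_sided <- t.test(group_a, group_b)

# One-sided test: H1 is mean(group_a) > mean(group_b)
greater <- t.test(group_a, group_b, alternative = "greater")

# var.equal is left at its default (FALSE), so both calls run Welch's test
two_sided$method

# Compare the two p-values
c(two_sided = two_sided$p.value, greater = greater$p.value)
```

Because the first group's sample mean is the larger one here, the one-sided p-value is half the two-sided one.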

The following example shows how to use this function to perform Welch’s t-test in R.

**Example: Welch’s t-test in R**

A teacher wants to compare the exam scores of 12 students who used an exam prep booklet to prepare for an exam with those of 12 students who did not.

The following vectors show the exam scores for the students in each group:

booklet <- c(90, 85, 88, 89, 94, 91, 79, 83, 87, 88, 91, 90)
no_booklet <- c(67, 90, 71, 95, 88, 83, 72, 66, 75, 86, 93, 84)

Before we perform a Welch’s t-test, we can first create boxplots to visualize the distribution of scores for each group:

boxplot(booklet, no_booklet, names=c("Booklet","No Booklet"))

We can see that the “Booklet” group has a higher median score and a much narrower spread of scores, which suggests a higher mean and a lower variance.
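Boxplots display medians and quartiles rather than means and variances, so a quick numeric check can back up the visual impression. The following sketch uses only the base R functions `mean()` and `var()` on the two vectors from this example:

```r
booklet <- c(90, 85, 88, 89, 94, 91, 79, 83, 87, 88, 91, 90)
no_booklet <- c(67, 90, 71, 95, 88, 83, 72, 66, 75, 86, 93, 84)

# Sample means of each group
mean_booklet <- mean(booklet)        # about 87.92
mean_no_booklet <- mean(no_booklet)  # about 80.83

# Sample variances of each group
var_booklet <- var(booklet)          # about 16.27
var_no_booklet <- var(no_booklet)    # about 104.15

# Ratio of the two variances
var_no_booklet / var_booklet
```

The variance of the “No Booklet” group is several times larger than that of the “Booklet” group, which is the situation Welch’s t-test is designed for.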

To formally test whether or not the mean scores between the groups are significantly different, we can perform Welch’s t-test:

#perform Welch's t-test
t.test(booklet, no_booklet)

	Welch Two Sample t-test

data:  booklet and no_booklet
t = 2.2361, df = 14.354, p-value = 0.04171
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  0.3048395 13.8618272
sample estimates:
mean of x mean of y 
 87.91667  80.83333 

From the output we can see that the *t* test statistic is **2.2361** and the corresponding p-value is **0.04171**.

Since this p-value is less than 0.05, we can reject the null hypothesis that the two group means are equal and conclude that there is a statistically significant difference in mean exam scores between the two groups.
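To see where these numbers come from, the test statistic, degrees of freedom, and p-value can be reproduced by hand. This is a sketch using the standard Welch formulas (the difference in means divided by the unpooled standard error, with Welch-Satterthwaite degrees of freedom); the variable names are chosen here for illustration.

```r
booklet <- c(90, 85, 88, 89, 94, 91, 79, 83, 87, 88, 91, 90)
no_booklet <- c(67, 90, 71, 95, 88, 83, 72, 66, 75, 86, 93, 84)

n1 <- length(booklet)
n2 <- length(no_booklet)

# Each group's contribution to the squared standard error: s^2 / n
se1_sq <- var(booklet) / n1
se2_sq <- var(no_booklet) / n2

# Welch t statistic: difference in means over the unpooled standard error
t_stat <- (mean(booklet) - mean(no_booklet)) / sqrt(se1_sq + se2_sq)

# Welch-Satterthwaite approximation for the degrees of freedom
df <- (se1_sq + se2_sq)^2 / (se1_sq^2 / (n1 - 1) + se2_sq^2 / (n2 - 1))

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

round(c(t = t_stat, df = df, p = p_value), 4)
```

The rounded values agree with the t = 2.2361, df = 14.354, and p-value = 0.04171 reported by **t.test()**. The non-integer degrees of freedom are a result of the Welch-Satterthwaite approximation.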

The **t.test()** function also provides us with the following information:

- The 95% confidence interval for the difference in mean exam scores between the two groups is **[0.3048, 13.8618]**.
- The mean exam score of the first group is **87.91667**.
- The mean exam score of the second group is **80.83333**.
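These values can also be pulled out of the object that **t.test()** returns instead of being read off the printed output. The sketch below stores the result in a variable (named `result` here for illustration) and checks that the observed difference in sample means lies inside the confidence interval.

```r
booklet <- c(90, 85, 88, 89, 94, 91, 79, 83, 87, 88, 91, 90)
no_booklet <- c(67, 90, 71, 95, 88, 83, 72, 66, 75, 86, 93, 84)

result <- t.test(booklet, no_booklet)

# 95% confidence interval for the difference in means
ci <- result$conf.int    # 0.3048395 13.8618272

# Sample means of the two groups
group_means <- result$estimate

# Observed difference in sample means: 87.91667 - 80.83333
diff_means <- unname(group_means[1] - group_means[2])    # 7.083333

# The observed difference always lies inside its own confidence interval
ci[1] < diff_means && diff_means < ci[2]
```

The observed difference of about 7.08 points is the midpoint of the interval, which extends the same margin of error on either side of it.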

*You can find the complete documentation for the t.test() function by running ?t.test in R.*

**Additional Resources**

The following tutorials explain how to perform other common tasks in R:

How to Perform a One Sample t-test in R

How to Perform a Two Sample t-test in R

How to Perform a Paired Samples t-test in R

How to Plot Multiple Boxplots in One Chart in R

Could you please explain how the 95% confidence interval for the difference in mean exam scores between the two groups is [0.3048, 13.8618], while the actual measured difference between the means of the two groups is only around .07 (thereby falling outside of the 95% confidence interval)?

It just doesn’t make sense to me that the actual difference in means between the two samples used to conduct the test falls outside the 95% confidence interval, calculated from those very same samples.