How to Remove Outliers in Boxplots in R


Occasionally you may want to leave outliers out of a boxplot in R. This tutorial explains how to do so using both base R and ggplot2. In both cases the outliers are only hidden from the plot; they remain in the data.

Remove Outliers in Boxplots in Base R

Suppose we have the following dataset:

data <- c(5, 8, 8, 12, 14, 15, 16, 19, 20, 22, 24, 25, 25, 26, 30, 48)

The following code shows how to create a boxplot for this dataset in base R:

boxplot(data)

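To remove the outliers from the plot, you can use the argument outline=FALSE. This only stops the outlier points from being drawn; the box and whiskers are computed exactly as before: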

boxplot(data, outline=FALSE)

Boxplot with outlier removed in R
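
To see which values base R treats as outliers, one option is boxplot.stats(), which boxplot() uses to compute its statistics. The sketch below assumes the default whisker rule (coef = 1.5, meaning points more than 1.5 times the interquartile range beyond the hinges are flagged); the variable name stats is only illustrative.

```r
data <- c(5, 8, 8, 12, 14, 15, 16, 19, 20, 22, 24, 25, 25, 26, 30, 48)

# boxplot.stats() returns the five plotted statistics and the flagged outliers
stats <- boxplot.stats(data)

stats$stats  # lower whisker, lower hinge, median, upper hinge, upper whisker
stats$out    # values that boxplot() would draw as individual outlier points
```

For this dataset the only value flagged is 48, and the whiskers run from 5 to 30.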

Remove Outliers in Boxplots in ggplot2

Suppose we have the same values stored in a data frame:

data <- data.frame(y=c(5, 8, 8, 12, 14, 15, 16, 19, 20, 22, 24, 25, 25, 26, 30, 48))

The following code shows how to create a boxplot using the ggplot2 visualization library:

library(ggplot2)

ggplot(data, aes(y=y)) +
  geom_boxplot()

To remove the outliers from the plot, you can use the argument outlier.shape=NA, which tells geom_boxplot not to draw the outlier points:

ggplot(data, aes(y=y)) +
  geom_boxplot(outlier.shape = NA)

ggplot2 boxplot with outliers removed
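
As a quick check that the outlier is hidden rather than dropped, the computed layer data can be inspected with layer_data(). This is a minimal sketch; the names p and computed are illustrative.

```r
library(ggplot2)

data <- data.frame(y = c(5, 8, 8, 12, 14, 15, 16, 19, 20, 22, 24, 25, 25, 26, 30, 48))

p <- ggplot(data, aes(y = y)) +
  geom_boxplot(outlier.shape = NA)

# layer_data() returns the statistics ggplot2 computed for the first layer
computed <- layer_data(p)

computed$ymax           # end of the upper whisker
computed$outliers[[1]]  # outlier values, still computed even though not drawn
```

The outliers column is a list column, which is why the first element is extracted with [[1]].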

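Notice that ggplot2 does not automatically adjust the y-axis: the outlier is still in the data, so the axis still extends to 48. To zoom in on the box and whiskers, you can use coord_cartesian, which changes the visible range without dropping any data: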

ggplot(data, aes(y=y)) +
  geom_boxplot(outlier.shape = NA) +
  coord_cartesian(ylim=c(5, 30))

ggplot2 boxplot with no outliers
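
The limits c(5, 30) were read off the plot. As a rough sketch, they could also be computed from the data. The code below assumes ggplot2's default boxplot rule (quartiles from quantile(), whiskers reaching the most extreme values within 1.5 times the IQR of the box); the names q, iqr, inside, and whisker_range are illustrative and not part of ggplot2.

```r
library(ggplot2)

data <- data.frame(y = c(5, 8, 8, 12, 14, 15, 16, 19, 20, 22, 24, 25, 25, 26, 30, 48))

# First and third quartiles, then the interquartile range
q <- unname(quantile(data$y, c(0.25, 0.75)))
iqr <- q[2] - q[1]

# Keep only values within 1.5 * IQR of the box; their range gives the whisker ends
inside <- data$y >= q[1] - 1.5 * iqr & data$y <= q[2] + 1.5 * iqr
whisker_range <- range(data$y[inside])

ggplot(data, aes(y = y)) +
  geom_boxplot(outlier.shape = NA) +
  coord_cartesian(ylim = whisker_range)
```

Computing the limits this way avoids hard-coding values that would need updating whenever the data change.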

Additional Resources

How to Set Axis Limits in ggplot2
How to Create Side-by-Side Plots in ggplot2
A Complete Guide to the Best ggplot2 Themes
