You can use the **stat_smooth()** function in ggplot2 to “smooth” the results of a scatterplot and gain a better understanding of the general pattern of points in a plot.

This function is versatile: it can summarize both linear and non-linear trends in a dataset, with or without a shaded confidence band around the fitted line.

The following example shows how to use the **stat_smooth()** function in practice in R.

**Example: How to Use stat_smooth() in R**

For this particular example we will use the built-in mtcars dataset in R, which contains various measurements on different cars.

We can use the **head()** function to view the first six rows of this dataset:

    #view first six rows of mtcars dataset
    head(mtcars)

                       mpg cyl disp  hp drat    wt  qsec vs am gear carb
    Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
    Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
    Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
    Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
    Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
    Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Suppose that we would like to create a scatterplot to visualize the relationship between the **mpg** (miles per gallon) and **wt** (weight) of each vehicle in the data frame.

We can use the following syntax to do so:

    library(ggplot2)

    #generate scatterplot of mpg vs wt
    ggplot(mtcars, aes(mpg, wt)) +
      geom_point()

This produces the following scatterplot:

The x-axis displays the **mpg** values while the y-axis displays the **wt** values.

Just from looking at the scatterplot we can see that there is a general trend of higher **mpg** values being associated with lower **wt** values.
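As a quick numerical check on this visual impression, the short sketch below computes the Pearson correlation between the two variables. It uses only base R and the built-in mtcars dataset; the variable name `r` is just an illustrative choice.

```r
# Pearson correlation between mpg and wt in the built-in mtcars dataset
r <- cor(mtcars$mpg, mtcars$wt)
r
```

The result is strongly negative (about -0.87), which agrees with the downward trend seen in the scatterplot.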

To make this trend even easier to see, we can add a **stat_smooth()** layer to the plot.

We can use the following syntax to do so:

    library(ggplot2)

    #generate scatterplot of mpg vs wt and add stat_smooth
    ggplot(mtcars, aes(mpg, wt)) +
      geom_point() +
      stat_smooth()

This produces the following result:

The same scatterplot points still appear in the plot, but now a “smooth” line with a shaded confidence band (95% by default) is also shown, which captures the general trend of the data.

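It’s worth noting that for datasets with fewer than 1,000 observations, such as this one, the default smoothing method is **loess**, which is flexible enough to capture a trend without using a straight line. For larger datasets, the default is a generalized additive model (**gam**) instead.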
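To make the loess behavior explicit rather than relying on the default, the sketch below names the method and formula directly and also adjusts **span**, the loess tuning parameter that controls how much smoothing is applied. The value `span = 1` and the object name `p_loess` are illustrative assumptions, not recommendations.

```r
library(ggplot2)

# Request loess explicitly instead of relying on the automatic default.
# span controls the amount of smoothing: larger values give a smoother curve
# (span = 1 is an illustrative choice; the default is 0.75)
p_loess <- ggplot(mtcars, aes(mpg, wt)) +
  geom_point() +
  stat_smooth(method = "loess", formula = y ~ x, span = 1)

p_loess
```

Naming the method and formula explicitly also suppresses the message that ggplot2 prints when it chooses them automatically.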

However, you can specify **method='lm'** to fit a straight line instead.

It’s also worth noting that the shaded confidence band around the line is shown by default, but you can specify **se=FALSE** within the **stat_smooth()** function to hide it.
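If you would rather keep the band but change its confidence level, **stat_smooth()** also accepts a **level** argument (0.95 by default). The sketch below compares the average band width at two levels; the helper `band_width()` and the object names are hypothetical, introduced only for this illustration.

```r
library(ggplot2)

base_plot <- ggplot(mtcars, aes(mpg, wt)) + geom_point()

# level sets the confidence level of the shaded band (0.95 by default)
p_95 <- base_plot + stat_smooth(method = "lm", formula = y ~ x)
p_99 <- base_plot + stat_smooth(method = "lm", formula = y ~ x, level = 0.99)

# Hypothetical helper: average width of the band in the smooth layer (layer 2)
band_width <- function(p) {
  d <- layer_data(p, 2)
  mean(d$ymax - d$ymin)
}

c(level_95 = band_width(p_95), level_99 = band_width(p_99))
```

The 99% band comes out wider than the 95% band, as expected for a higher confidence level.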

We can use the following syntax to add a straight line with no confidence band instead:

    library(ggplot2)

    #generate scatterplot of mpg vs wt and add stat_smooth
    ggplot(mtcars, aes(mpg, wt)) +
      geom_point() +
      stat_smooth(method='lm', se=FALSE)

This produces the following result:

Notice that the **stat_smooth()** function produces a straight line this time with no confidence band.

This line is the same “line of best fit” that we would obtain from a simple linear regression with **wt** as the response variable and **mpg** as the predictor variable.
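One way to check this is to fit the regression with **lm()** and compare its predictions with the points that ggplot2 computes for the line. The sketch below does that using **layer_data()**; the object names `fit`, `p_lm`, `line_df`, and `pred` are illustrative.

```r
library(ggplot2)

# Fit the simple linear regression: wt is the response, mpg is the predictor
fit <- lm(wt ~ mpg, data = mtcars)
coef(fit)

p_lm <- ggplot(mtcars, aes(mpg, wt)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Extract the points that make up the drawn line (layer 2 is the smooth)
line_df <- layer_data(p_lm, 2)

# Predict wt at the same mpg values using the lm() fit and compare
pred <- predict(fit, newdata = data.frame(mpg = line_df$x))
all.equal(line_df$y, unname(pred))
```

If the two agree, `all.equal()` returns `TRUE`, confirming that the plotted line is the ordinary least squares fit.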

**Note**: You can find the complete documentation for the **stat_smooth()** function in the official ggplot2 reference.
