How to Create a Density Plot in R Using ggplot2

In statistics, we’re often interested in understanding how a particular variable is distributed in a dataset. One common way to visualize the distribution of a variable is to create a histogram, which uses bars to represent frequencies of data values.

For example, consider the built-in ggplot2 dataset diamonds:

#load ggplot2
library(ggplot2)

#view first six rows of diamonds dataset
head(diamonds)

#  carat cut       color clarity depth table price     x     y     z        
#1 0.23  Ideal     E     SI2      61.5    55   326  3.95  3.98  2.43
#2 0.21  Premium   E     SI1      59.8    61   326  3.89  3.84  2.31
#3 0.23  Good      E     VS1      56.9    65   327  4.05  4.07  2.31
#4 0.290 Premium   I     VS2      62.4    58   334  4.2   4.23  2.63
#5 0.31  Good      J     SI2      63.3    58   335  4.34  4.35  2.75
#6 0.24  Very Good J     VVS2     62.8    57   336  3.94  3.96  2.48

To visualize the distribution of the variable depth, we could create a histogram:

ggplot(diamonds, aes(x = depth)) + 
  geom_histogram(color = 'black', fill = 'steelblue')

Histogram example in R using ggplot2

This shows that most of the values for depth are concentrated around 62, with some values as low as about 55 and some as high as almost 70. Overall, the distribution is tightly concentrated rather than spread out.

By default, ggplot2 uses 30 equal-width bins to create a histogram. However, we could increase the number of bins to 60 to get a more granular view of the distribution:

ggplot(diamonds, aes(x = depth)) + 
  geom_histogram(color = 'black', fill = 'steelblue', bins = 60)

Histogram in R with 60 bins

Or we could increase it to 120 bins to get an even more in-depth look at the distribution:

ggplot(diamonds, aes(x = depth)) + 
  geom_histogram(color = 'black', fill = 'steelblue', bins = 120)

Histogram in R with 120 bins
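Instead of choosing a number of bins, we can set the width of each bin directly. The following is a minimal sketch that uses the binwidth argument of geom_histogram(); the value 0.5 is only an illustrative choice, expressed in the same units as depth, and the object name p_binwidth is made up for this example:

```r
library(ggplot2)

# binwidth is given in the units of the x variable (here, depth)
# 0.5 is an arbitrary value chosen for illustration
p_binwidth <- ggplot(diamonds, aes(x = depth)) +
  geom_histogram(color = 'black', fill = 'steelblue', binwidth = 0.5)

# display the histogram
p_binwidth
```

Setting binwidth can be easier to reason about than bins when the variable has natural units, since the bars keep the same width even if the range of the data changes.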

Notice that the more bins we use, the “smoother” the histogram begins to appear. In fact, we could represent this distribution with one smooth continuous curve by simply using a density plot, which produces a curve that shows the distribution of the data using a method known as kernel density estimation.
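To get a feel for what kernel density estimation produces, we can compute an estimate directly with the base R function density(). This is only a sketch meant to show the idea: it uses the default Gaussian kernel and default bandwidth, and the variable names kde and area are made up for this example.

```r
library(ggplot2)  # only needed here for the diamonds dataset

# estimate the density of depth (Gaussian kernel and default bandwidth)
kde <- density(diamonds$depth)

# kde$x holds the points where the density was evaluated,
# kde$y holds the estimated density at each of those points
head(data.frame(depth = kde$x, density = kde$y))

# approximate the area under the curve with the trapezoid rule;
# for a density estimate it should be close to 1
area <- sum(diff(kde$x) * (head(kde$y, -1) + tail(kde$y, -1)) / 2)
area
```

The estimated curve is a smoothed version of the histogram: rather than counting values in bins, each observation contributes a small bump, and the bumps are added together.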

The following code illustrates how to create a density plot for the variable depth by using the geom_density() function in ggplot2:

ggplot(diamonds, aes(x = depth)) + 
  geom_density()

Basic density plot in ggplot2 using R

Notice that this plot has the same shape as the histograms we showed before, but we’re able to see this shape through the use of one single curve, as opposed to several bars.
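How smooth the curve looks depends on the bandwidth used for the estimate. As a rough sketch, the adjust argument of geom_density() multiplies the default bandwidth; the values 0.5 and 2 below are arbitrary choices for illustration, and the object names are made up for this example:

```r
library(ggplot2)

# adjust < 1 shrinks the bandwidth, giving a more detailed, bumpier curve
p_rough <- ggplot(diamonds, aes(x = depth)) +
  geom_density(adjust = 0.5)

# adjust > 1 widens the bandwidth, giving a smoother, flatter curve
p_smooth <- ggplot(diamonds, aes(x = depth)) +
  geom_density(adjust = 2)

# display both plots
p_rough
p_smooth
```

This plays a similar role to the number of bins in a histogram: a small bandwidth shows more detail but also more noise, while a large one can hide real features of the distribution.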

Modifying the Aesthetics of a Density Plot in R

We’ve seen how to create a basic density plot. Now we’ll show how to modify the aesthetics of the plot to add titles, axis labels, axis ticks, colors, legends, and more.

Adding Titles & Labels

We can add both a title and axis labels to the density plot using the labs() function:

ggplot(diamonds, aes(x = depth)) + 
  geom_density() +
  labs(title = 'Density Plot of Diamond Depths', x = 'Depth', y = 'Density')

Density plot in R with titles and labels
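The labs() function also accepts a subtitle and a caption. The sketch below shows both; the subtitle and caption text are placeholders invented for this example, as is the object name p_labeled:

```r
library(ggplot2)

p_labeled <- ggplot(diamonds, aes(x = depth)) +
  geom_density() +
  labs(title = 'Density Plot of Diamond Depths',
       subtitle = 'Estimated with geom_density()',  # placeholder text
       caption = 'Data: ggplot2 diamonds dataset',  # placeholder text
       x = 'Depth', y = 'Density')

# display the plot
p_labeled
```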

Modifying the Number of Axis Tick Marks

We can also modify the number of tick marks that show up on the x-axis by using the scale_x_continuous() function.

The following code shows how to add a tick mark at every 5th number on the x-axis:

ggplot(diamonds, aes(x = depth)) + 
  geom_density() +
  labs(title = 'Density Plot of Diamond Depths', x = 'Depth', y = 'Density') +
  scale_x_continuous(breaks = seq(40, 80, 5)) #add tick mark at every 5th number

Density plot with modified tick marks in R using ggplot2

Or we could display tick marks even more frequently, at every second number on the x-axis:

ggplot(diamonds, aes(x = depth)) + 
  geom_density() +
  labs(title = 'Density Plot of Diamond Depths', x = 'Depth', y = 'Density') +
  scale_x_continuous(breaks = seq(40, 80, 2)) #add tick mark at every 2nd number

Density plot with tick marks on every second number in ggplot2
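Since most of the values lie between about 55 and 70, we may also want to zoom in on that part of the x-axis. One way to do this, sketched below, is coord_cartesian(), which changes the visible range without dropping any observations from the density estimate. The limits 55 and 70 are illustrative choices based on the range described earlier:

```r
library(ggplot2)

p_zoom <- ggplot(diamonds, aes(x = depth)) +
  geom_density() +
  labs(title = 'Density Plot of Diamond Depths', x = 'Depth', y = 'Density') +
  scale_x_continuous(breaks = seq(40, 80, 2)) +
  coord_cartesian(xlim = c(55, 70)) #zoom in without removing any data

# display the plot
p_zoom
```

Setting limits inside scale_x_continuous() instead would remove the values outside the limits before the density is estimated, which can change the shape of the curve.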

Adding Colors

We can easily modify the fill color and the outline color of the density plot by using the following code:

ggplot(diamonds, aes(x = depth)) + 
  geom_density(fill = 'steelblue', color = 'black') +
  labs(title = 'Density Plot of Diamond Depths', x = 'Depth', y = 'Density')

Density plot with fill and outline colors in R

We could also use hex codes for the fill and outline colors as well:

ggplot(diamonds, aes(x = depth)) + 
  geom_density(fill = '#8ff291', color = '#000000') +
  labs(title = 'Density Plot of Diamond Depths', x = 'Depth', y = 'Density')

Density plot in R with hex color codes

In addition, we can specify how transparent we’d like the fill color to be by using the alpha argument, which ranges from 0 (fully transparent) to 1 (fully opaque):

ggplot(diamonds, aes(x = depth)) + 
  geom_density(fill = '#8ff291', color = '#000000', alpha = 0.4) +
  labs(title = 'Density Plot of Diamond Depths', x = 'Depth', y = 'Density')

Density plot with adjusted alpha value for fill color in ggplot2

Adding Lines

It’s also easy to add lines on the density plot. For example, the following code illustrates how to place a dashed line at the median value of the distribution:

#find median depth value
median_depth <- median(diamonds$depth)

#create density plot with dashed line at the median
ggplot(diamonds, aes(x = depth)) + 
  geom_density(fill = '#8ff291', color = '#000000', alpha = 0.4) +
  geom_vline(xintercept = median_depth, linetype = 'dashed') +
  labs(title = 'Density Plot of Diamond Depths', x = 'Depth', y = 'Density')

Density plot in ggplot2 with line at the median
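The same approach works for more than one reference line. As a sketch, the code below adds a second line at the mean; the dotted line type and the red color are arbitrary styling choices, and p_lines is a made-up object name:

```r
library(ggplot2)

#find median and mean depth values
median_depth <- median(diamonds$depth)
mean_depth <- mean(diamonds$depth)

#create density plot with a dashed line at the median and a dotted line at the mean
p_lines <- ggplot(diamonds, aes(x = depth)) +
  geom_density(fill = '#8ff291', color = '#000000', alpha = 0.4) +
  geom_vline(xintercept = median_depth, linetype = 'dashed') +
  geom_vline(xintercept = mean_depth, linetype = 'dotted', color = 'red') +
  labs(title = 'Density Plot of Diamond Depths', x = 'Depth', y = 'Density')

# display the plot
p_lines
```

When the two lines sit close together, as they would for a roughly symmetric distribution, this is a quick visual check that the data are not heavily skewed.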

Multiple Density Plots

It’s possible to create multiple density plots at once by using the facet_wrap() function. For example, we can create one density plot for each of the five categories of cut in the dataset by using the following code:

ggplot(diamonds, aes(x = depth)) + 
  geom_density(fill = '#8ff291', color = '#000000') +
  facet_wrap(~ cut) +
  labs(title = 'Density Plots of Diamond Depths by Cut', x = 'Depth', y = 'Density')

Density plots with facet_wrap in ggplot2

This gives us an easy way to see how the distribution of depth varies based on the value of cut. For example, we see that the values of depth are quite concentrated for the “Ideal” cut, but they’re much more spread out for the “Fair” cut.
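We can back up this visual impression with a number. The sketch below uses the base R function aggregate() to compute the standard deviation of depth within each cut; the object name spread_by_cut and the column name sd_depth are made up for this example:

```r
library(ggplot2)  # only needed here for the diamonds dataset

#compute the standard deviation of depth for each cut
spread_by_cut <- aggregate(depth ~ cut, data = diamonds, FUN = sd)

#give the summary column a more descriptive name
names(spread_by_cut)[2] <- 'sd_depth'

spread_by_cut
```

A larger standard deviation corresponds to a wider, flatter density curve in the faceted plot.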

We can also specify how many columns we would like the facet_wrap() to use. For example, we could place all five plots side by side by specifying the number of columns to be equal to five:

ggplot(diamonds, aes(x = depth)) + 
  geom_density(fill = '#8ff291', color = '#000000') +
  facet_wrap(~ cut, ncol = 5) +
  labs(title = 'Density Plots of Diamond Depths by Cut', x = 'Depth', y = 'Density')

Facet_wrap with density plots in ggplot2
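By default, every panel shares the same axes, so a cut with a low, wide curve can be hard to read next to one with a tall, narrow peak. As a sketch, the scales argument of facet_wrap() lets each panel use its own y-axis:

```r
library(ggplot2)

p_free <- ggplot(diamonds, aes(x = depth)) +
  geom_density(fill = '#8ff291', color = '#000000') +
  facet_wrap(~ cut, ncol = 5, scales = 'free_y') + #each panel gets its own y-axis
  labs(title = 'Density Plots of Diamond Depths by Cut', x = 'Depth', y = 'Density')

# display the plot
p_free
```

The trade-off is that peak heights can no longer be compared directly across panels, so this option is best suited to comparing shapes rather than heights.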

Alternatively, we could place all five density plots on the same chart by mapping the fill to the variable cut and by specifying position = 'identity' and alpha = 0.4 within the geom_density() call:

ggplot(diamonds, aes(x = depth, fill = cut)) + 
  geom_density(position = 'identity', alpha = 0.4) +
  labs(title = 'Density Plots of Diamond Depths by Cut', x = 'Depth', y = 'Density')

Multiple density plots in one chart in R

Since it’s a bit difficult to differentiate between the overlapping density plots, we could remove the fill and instead map the outline color to cut:

ggplot(diamonds, aes(x = depth, color = cut)) + 
  geom_density(position="identity", alpha=0.4, fill = NA) +
  labs(title = 'Density Plots of Diamond Depths by Cut', x = 'Depth', y = 'Density')

Density plots with no fill in ggplot2
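If the default outline colors are still hard to tell apart, one option is to switch to a different discrete palette. The sketch below uses scale_color_brewer() with the 'Dark2' palette, which is just one possible choice; the alpha argument is left out here because there is no fill for it to act on:

```r
library(ggplot2)

p_palette <- ggplot(diamonds, aes(x = depth, color = cut)) +
  geom_density(position = 'identity', fill = NA) +
  scale_color_brewer(palette = 'Dark2') + #one of the ColorBrewer qualitative palettes
  labs(title = 'Density Plots of Diamond Depths by Cut', x = 'Depth', y = 'Density')

# display the plot
p_palette
```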

Changing Themes

Lastly, we can change the theme of the density plot if we’d like.

For example, we could use theme_classic() to remove the grey background and the grid lines from the chart:

ggplot(diamonds, aes(x = depth, color = cut)) + 
  geom_density(position="identity", alpha=0.4, fill = NA) +
  labs(title = 'Density Plots of Diamond Depths by Cut', x = 'Depth', y = 'Density') + 
  theme_classic()

Density plots with theme_classic in ggplot2

Or we could use theme_bw() to add a black border around the chart while keeping grid lines in the background:

ggplot(diamonds, aes(x = depth, color = cut)) + 
  geom_density(position="identity", alpha=0.4, fill = NA) +
  labs(title = 'Density Plots of Diamond Depths by Cut', x = 'Depth', y = 'Density') + 
  theme_bw()

Density plot with theme_bw in ggplot2

For a complete list of built-in ggplot2 themes, see the ggplot2 documentation.
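A built-in theme can also be fine-tuned with the theme() function. As a sketch, the code below starts from theme_bw(), moves the legend to the bottom, and centers the title; both adjustments are only illustrative:

```r
library(ggplot2)

p_theme <- ggplot(diamonds, aes(x = depth, color = cut)) +
  geom_density(position = 'identity', fill = NA) +
  labs(title = 'Density Plots of Diamond Depths by Cut', x = 'Depth', y = 'Density') +
  theme_bw() +
  theme(legend.position = 'bottom',              #move the legend below the chart
        plot.title = element_text(hjust = 0.5))  #center the title

# display the plot
p_theme
```

The theme() call should come after theme_bw(), since a complete theme added later would overwrite the individual settings.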

We could also use the ggthemes package to access even more themes. For example, the following code illustrates how to use theme_wsj(), which is meant to mimic the style of charts seen in The Wall Street Journal:

#load ggthemes library
library(ggthemes)

#add theme_wsj() to density plot chart
ggplot(diamonds, aes(x = depth, color = cut)) + 
  geom_density(position="identity", alpha=0.4, fill = NA) +
  labs(title = 'Diamond Depths') + 
  theme_wsj()

Density plot in R with wall street journal theme

For a complete list of ggthemes, check out the documentation page.
