How to Create a Distribution Plot in Matplotlib


There are two common ways to create a distribution plot in Python:

Method 1: Create Histogram Using Matplotlib

import matplotlib.pyplot as plt

plt.hist(data, color='lightgreen', ec='black', bins=15)

Note that color sets the fill color of the bars, ec (short for edgecolor) sets their edge color, and bins sets the number of bins in the histogram.

Method 2: Create Histogram with Density Curve Using Seaborn

import seaborn as sns
sns.displot(data, kde=True, bins=15)

Note that kde=True specifies that a kernel density estimate (KDE) curve should be overlaid on the histogram.

The examples below show how to use each method in practice to visualize the distribution of values in the following NumPy array:

import numpy as np

#make this example reproducible.
np.random.seed(1)

#create numpy array with 1000 values that follow normal dist with mean=10 and sd=2
data = np.random.normal(size=1000, loc=10, scale=2)

#view first five values
data[:5]

array([13.24869073,  8.77648717,  8.9436565 ,  7.85406276, 11.73081526])
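As a quick sanity check, the sample mean and standard deviation of this array should land close to the requested values of 10 and 2. The snippet below is a minimal sketch that regenerates the same array. The names sample_mean and sample_sd are illustrative and not part of the original example.

```python
import numpy as np

# regenerate the same array used throughout this tutorial
np.random.seed(1)
data = np.random.normal(size=1000, loc=10, scale=2)

# summary statistics of the sample (variable names are illustrative)
sample_mean = data.mean()
sample_sd = data.std(ddof=1)  # ddof=1 gives the sample standard deviation

print(round(sample_mean, 2), round(sample_sd, 2))
```

With only 1000 draws, the printed values will be near 10 and 2 but not exactly equal to them.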

Example 1: Create Histogram Using Matplotlib

We can use the following code to create a histogram in Matplotlib to visualize the distribution of values in the NumPy array:

import matplotlib.pyplot as plt

#create histogram
plt.hist(data, color='lightgreen', ec='black', bins=15)

The x-axis displays the range of values in the NumPy array, divided into bins, and the y-axis displays the number of values that fall into each bin.
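To make the meaning of the y-axis concrete, it helps to know that plt.hist() also returns the bar heights and bin edges it computed. The following is a minimal sketch that assumes the same data array as above. The variable names counts, edges and bars are chosen here for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(1)
data = np.random.normal(size=1000, loc=10, scale=2)

# plt.hist returns the bar heights, the bin edges and the drawn bars
counts, edges, bars = plt.hist(data, color='lightgreen', ec='black', bins=15)

# 15 bins are delimited by 16 edges
print(len(counts), len(edges))

# every value falls into exactly one bin, so the heights add up to 1000
print(int(counts.sum()))
```

In a standalone script, adding plt.show() at the end displays the figure.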

Note that the larger the value you use for the bins argument, the more bars there will be in the histogram.
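The effect of the bins argument is easiest to see side by side. The sketch below draws the same array with three bin counts. The choices 5, 15 and 50 and the name bars_drawn are assumptions made for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(1)
data = np.random.normal(size=1000, loc=10, scale=2)

# draw the same data with three different bin counts, side by side
bin_choices = [5, 15, 50]
fig, axes = plt.subplots(1, len(bin_choices), figsize=(12, 3))

bars_drawn = {}
for ax, n_bins in zip(axes, bin_choices):
    counts, edges, bars = ax.hist(data, color='lightgreen', ec='black', bins=n_bins)
    ax.set_title(f'bins={n_bins}')
    bars_drawn[n_bins] = len(counts)  # one bar per bin

print(bars_drawn)
```

As a rough guide, very few bins can hide the shape of the distribution, while very many bins can make the plot look noisy, so it is usually worth trying a few values.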

Example 2: Create Histogram with Density Curve Using Seaborn

We can use the following code to create a histogram with a density curve overlaid on it using the seaborn data visualization library:

import seaborn as sns

#create histogram with density curve overlaid
sns.displot(data, kde=True, bins=15)

The result is a histogram with a density curve overlaid on it.

The benefit of using a density curve is that it summarizes the shape of the distribution using a single continuous curve.

Note: The complete documentation for the seaborn displot() function is available on the official seaborn website.

Additional Resources

The following tutorials explain how to create other common charts in Python:

How to Create Stacked Bar Charts in Matplotlib
How to Create a Relative Frequency Histogram in Matplotlib
How to Create a Horizontal Barplot in Seaborn
