There are two common ways to create a distribution plot in Python:
Method 1: Create Histogram Using Matplotlib
import matplotlib.pyplot as plt
plt.hist(data, color='lightgreen', ec='black', bins=15)
Note that color controls the fill color of the bars, ec (short for edgecolor) controls the color of the bar edges, and bins controls the number of bins in the histogram.
Method 2: Create Histogram with Density Curve Using Seaborn
import seaborn as sns
sns.displot(data, kde=True, bins=15)
Note that kde=True specifies that a kernel density estimate (a smoothed density curve) should be overlaid on the histogram.
The examples below show how to use each method in practice to visualize the distribution of values in this NumPy array:
import numpy as np
#make this example reproducible.
np.random.seed(1)
#create numpy array with 1000 values that follow normal dist with mean=10 and sd=2
data = np.random.normal(size=1000, loc=10, scale=2)
#view first five values
data[:5]
array([13.24869073, 8.77648717, 8.9436565 , 7.85406276, 11.73081526])
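Before plotting, it can help to confirm that the simulated values look the way we expect. The short sketch below recreates the same array and prints its sample mean and standard deviation, which should land close to (but not exactly on) the requested 10 and 2. The variable names sample_mean and sample_sd are only illustrative.

```python
import numpy as np

# recreate the array from the example above
np.random.seed(1)
data = np.random.normal(size=1000, loc=10, scale=2)

# sample mean and sample standard deviation (ddof=1 divides by n-1)
sample_mean = data.mean()
sample_sd = data.std(ddof=1)

print(f"mean: {sample_mean:.2f}, sd: {sample_sd:.2f}")
```

Because the values are a random sample, the printed statistics will differ slightly from the parameters passed to np.random.normal().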
Example 1: Create Histogram Using Matplotlib
We can use the following code to create a histogram in Matplotlib to visualize the distribution of values in the NumPy array:
import matplotlib.pyplot as plt
#create histogram
plt.hist(data, color='lightgreen', ec='black', bins=15)
#display the plot (needed when running as a script)
plt.show()
The x-axis displays the values from the NumPy array, grouped into 15 equal-width bins, and the y-axis displays the number of values that fall into each bin.
Note that when bins is an integer, a larger value produces more (and narrower) bars in the histogram.
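To see the effect of the bins argument without drawing anything, we can compute the bin counts directly with np.histogram(), which is the function Matplotlib relies on for binning. The sketch below recreates the same array and tries three bin counts; the values 5, 15 and 50 and the summary dictionary are chosen only for illustration.

```python
import numpy as np

# recreate the array from the example above
np.random.seed(1)
data = np.random.normal(size=1000, loc=10, scale=2)

# compare how the number of bins changes the bin width and the tallest bar
summary = {}
for n_bins in (5, 15, 50):
    counts, edges = np.histogram(data, bins=n_bins)
    bin_width = edges[1] - edges[0]
    summary[n_bins] = (counts, edges)
    print(f"bins={n_bins}: width={bin_width:.2f}, tallest bar={counts.max()}")
```

Each call returns one count per bin plus one more edge than there are bins, and the counts always add up to the 1,000 values in the array. More bins means each bar covers a narrower range and therefore holds fewer values.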
Example 2: Create Histogram with Density Curve Using Seaborn
We can use the following code to create a histogram with a density curve overlaid on it using the seaborn data visualization library:
import matplotlib.pyplot as plt
import seaborn as sns
#create histogram with density curve overlaid
sns.displot(data, kde=True, bins=15)
#display the plot (needed when running as a script)
plt.show()
The result is a histogram with a smooth density curve drawn over the bars.
The benefit of using a density curve is that it summarizes the shape of the distribution using a single continuous curve, and that shape is less sensitive to the choice of bins than the bars themselves.
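To get a feel for what a density curve represents, the sketch below computes a Gaussian kernel density estimate by hand using only NumPy. This is an illustration of the general technique, not necessarily the exact computation seaborn performs. The bandwidth uses Scott's rule of thumb, which is an assumption made for this example, and the names grid, density and area are illustrative.

```python
import numpy as np

# recreate the array from the example above
np.random.seed(1)
data = np.random.normal(size=1000, loc=10, scale=2)
n = data.size

# bandwidth from Scott's rule of thumb for 1-D data (an assumption for this sketch)
bandwidth = data.std(ddof=1) * n ** (-1 / 5)

# points along the x-axis at which to evaluate the density curve
grid = np.linspace(data.min() - 3 * bandwidth, data.max() + 3 * bandwidth, 200)

# place a Gaussian bump on every observation and average the bumps
z = (grid[:, None] - data[None, :]) / bandwidth
density = np.exp(-0.5 * z ** 2).sum(axis=1) / (n * bandwidth * np.sqrt(2 * np.pi))

# the area under a density curve should be close to 1 (trapezoidal rule)
area = np.sum((density[1:] + density[:-1]) / 2 * np.diff(grid))
peak = grid[np.argmax(density)]

print(f"area under curve: {area:.3f}, peak near x = {peak:.2f}")
```

The curve peaks near the center of the data and the area beneath it is approximately 1, which is what makes it a density rather than a count.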
Note: You can find the complete documentation for the displot() function in the official seaborn documentation.
Additional Resources
The following tutorials explain how to create other common charts in Python:
How to Create Stacked Bar Charts in Matplotlib
How to Create a Relative Frequency Histogram in Matplotlib
How to Create a Horizontal Barplot in Seaborn