How to Plot Categorical Data in Pandas (With Examples)


There are three common ways to visualize categorical data:

  • Bar Charts
  • Boxplots by Group
  • Mosaic Plots

The following examples show how to create each of these plots for a pandas DataFrame in Python.

Example 1: Bar Charts

The following code shows how to create a bar chart to visualize how often each team appears in a pandas DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'D'],
                   'points': [18, 22, 29, 25, 14, 11, 10, 15]})

#create bar plot to visualize frequency of each team
df['team'].value_counts().plot(kind='bar', xlabel='Team', ylabel='Count', rot=0)


The x-axis displays each team name and the y-axis shows the frequency of each team in the DataFrame.

Note: The argument rot=0 sets the rotation of the x-axis tick labels to 0 degrees so that they are displayed horizontally. By default, pandas draws the tick labels of a bar plot vertically.
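To check the numbers behind the bars, it can help to print the counts before plotting them. The following is a minimal sketch that reuses the DataFrame from this example; the variable names team_counts and team_counts_sorted are only illustrative:

```python
import pandas as pd

#create DataFrame (same data as the bar chart example)
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'D'],
                   'points': [18, 22, 29, 25, 14, 11, 10, 15]})

#count how many rows belong to each team (most frequent team first)
team_counts = df['team'].value_counts()

#sort by team name instead, so the bars would appear in alphabetical order
team_counts_sorted = team_counts.sort_index()

print(team_counts_sorted)
```

Each value in the result corresponds to the height of one bar. Calling .plot(kind='bar') on team_counts_sorted instead of team_counts should draw the same bars in alphabetical order.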

Example 2: Boxplots by Group

Grouped boxplots are a useful way to visualize a numeric variable, grouped by a categorical variable.

For example, the following code shows how to create boxplots that show the distribution of points scored, grouped by team:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [18, 22, 29, 25, 14, 11, 10, 15]})

#create boxplot of points, grouped by team
df.boxplot(column=['points'], by='team', grid=False, color='black')

The x-axis displays the teams and the y-axis displays the distribution of points scored by each team.
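The values that each box summarizes can also be inspected as a table. The following is a minimal sketch that uses groupby() and describe() on the same DataFrame; the variable name summary is only illustrative, and the quartiles assume the default linear interpolation used by pandas:

```python
import pandas as pd

#create DataFrame (same data as the boxplot example)
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [18, 22, 29, 25, 14, 11, 10, 15]})

#calculate summary statistics of points for each team
summary = df.groupby('team')['points'].describe()

#keep only the five-number summary (min, quartiles, max)
summary = summary[['min', '25%', '50%', '75%', 'max']]

print(summary)
```

The 50% column holds the median of each team, which is drawn as the line inside each box, and the 25% and 75% columns should correspond to the bottom and top edges of each box.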

Example 3: Mosaic Plots

A mosaic plot is a type of plot that displays the frequencies of two different categorical variables in one plot.

For example, the following code shows how to create a mosaic plot that shows the frequency of the categorical variables ‘result’ and ‘team’ in one plot:

import pandas as pd
from statsmodels.graphics.mosaicplot import mosaic

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
                   'result': ['W', 'L', 'L', 'W', 'W', 'L', 'L', 'W', 'W']})

#create mosaic plot
mosaic(df, ['team', 'result']);


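The x-axis displays the teams and the y-axis displays the proportion of each result within each team. The area of each tile is proportional to the number of rows with that combination of team and result.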
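The frequencies that a mosaic plot is built from can be viewed as a table with pd.crosstab(). The following is a minimal sketch that uses the same DataFrame; the variable names counts and proportions are only illustrative:

```python
import pandas as pd

#create DataFrame (same data as the mosaic plot example)
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
                   'result': ['W', 'L', 'L', 'W', 'W', 'L', 'L', 'W', 'W']})

#count the number of rows for each combination of team and result
counts = pd.crosstab(df['team'], df['result'])

#convert the counts to proportions within each team (each row sums to 1)
proportions = pd.crosstab(df['team'], df['result'], normalize='index')

print(counts)
print(proportions)
```

Each cell in counts corresponds to one tile in the mosaic plot, and each row of proportions describes how the column for that team is split between losses and wins.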

Additional Resources

The following tutorials explain how to perform other common tasks in pandas:

How to Use Groupby and Plot in Pandas
How to Plot Distribution of Column Values in Pandas
How to Adjust the Figure Size of a Pandas Plot
