Pandas: Create Frequency Table Based on Multiple Columns


You can use the following basic syntax to create a frequency table in pandas based on multiple columns:

df.value_counts(['column1', 'column2'])

The following example shows how to use this syntax in practice.

Example: Create Frequency Table in Pandas Based on Multiple Columns

Suppose we have the following pandas DataFrame that contains information on team name, position, and points scored by various basketball players:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team' : ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'position' : ['G', 'G', 'G', 'F', 'G', 'G', 'F', 'F'],
                   'points': [24, 33, 20, 15, 16, 16, 29, 25]})

#view DataFrame
print(df)

  team position  points
0    A        G      24
1    A        G      33
2    A        G      20
3    A        F      15
4    B        G      16
5    B        G      16
6    B        F      29
7    B        F      25

We can use the value_counts() function to create a frequency table that shows the occurrence of each combination of values in the team and position columns:

#count frequency of values in team and position columns
df.value_counts(['team', 'position'])

team  position
A     G           3
B     F           2
      G           2
A     F           1
dtype: int64

From the results we can see:

  • There are 3 occurrences of team A and position G
  • There are 2 occurrences of team B and position F
  • There are 2 occurrences of team B and position G
  • There is 1 occurrence of team A and position F

Note that we can use reset_index() to return a DataFrame as a result instead:

#count frequency of values in team and position columns and return DataFrame
df.value_counts(['team', 'position']).reset_index()

        team	position  0
0	A	G	  3
1	B	F	  2
2	B	G	  2
3	A	F	  1

We can use the rename() function to rename the column that contains the counts:

#get frequency of values in team and position column and rename count column
df.value_counts(['team', 'position']).reset_index().rename(columns={0:'count'})

        team	position  count
0	A	G	  3
1	B	F	  2
2	B	G	  2
3	A	F	  1

The end result is a DataFrame that contains the frequency of each unique combination of values in the team and position columns.

Additional Resources

The following tutorials explain how to perform other common tasks in pandas:

Pandas: How to Use GroupBy and Value Counts
Pandas: How to Use GroupBy with Bin Counts
Pandas: How to Count Values in Column with Condition

Featured Posts

Leave a Reply

Your email address will not be published. Required fields are marked *