You can combine the groupby() and size() methods in pandas to count the number of rows (occurrences) in each group. There are three common ways to do this:
Method 1: Count Occurrences Grouped by One Variable
df.groupby('var1').size()
Method 2: Count Occurrences Grouped by Multiple Variables
df.groupby(['var1', 'var2']).size()
Method 3: Count Occurrences Grouped by Multiple Variables and Sort by Count
df.groupby(['var1', 'var2']).size().sort_values(ascending=False)
The following examples show how to use each method in practice with the following pandas DataFrame:
import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'position': ['G', 'G', 'F', 'F', 'F', 'G', 'G', 'G', 'G', 'F'],
                   'points': [15, 22, 24, 25, 20, 35, 34, 19, 14, 12]})

#view DataFrame
print(df)

  team position  points
0    A        G      15
1    A        G      22
2    A        F      24
3    A        F      25
4    A        F      20
5    B        G      35
6    B        G      34
7    B        G      19
8    B        G      14
9    B        F      12
Example 1: Count Occurrences Grouped by One Variable
The following code shows how to use the groupby() and size() functions to count the occurrences of values in the team column:
#count occurrences of each value in team column
df.groupby('team').size()
team
A    5
B    5
dtype: int64
From the output we can see that the values A and B both occur 5 times in the team column.
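Because size() returns a pandas Series, it can be convenient to turn the result into a regular DataFrame. The following is a minimal sketch of one way to do that; the column name 'count' and the variable name team_counts are chosen here for illustration only:

```python
import pandas as pd

# Same DataFrame as used throughout this tutorial
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'position': ['G', 'G', 'F', 'F', 'F', 'G', 'G', 'G', 'G', 'F'],
                   'points': [15, 22, 24, 25, 20, 35, 34, 19, 14, 12]})

# size() returns a Series indexed by team;
# reset_index(name=...) moves the index into a column and names the counts
# ('count' is an illustrative column name, not something pandas requires)
team_counts = df.groupby('team').size().reset_index(name='count')

print(team_counts)
```

The result has one row per team, with the group label in the team column and the number of rows in the count column.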
Example 2: Count Occurrences Grouped by Multiple Variables
The following code shows how to use the groupby() and size() functions to count the occurrences of values for each combination of values in the team and position columns:
#count occurrences of values for each combination of team and position
df.groupby(['team', 'position']).size()
team  position
A     F           3
      G           2
B     F           1
      G           4
dtype: int64
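From the output we can see:
- The combination of team A and position F occurs 3 times.
- The combination of team A and position G occurs 2 times.
- The combination of team B and position F occurs 1 time.
- The combination of team B and position G occurs 4 times.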
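When grouping by multiple columns, the result is a Series with a MultiIndex, so an individual count can be looked up by its combination of values. The sketch below assumes the same DataFrame as above; the variable names combo_counts, a_f and team_b are illustrative:

```python
import pandas as pd

# Same DataFrame as used throughout this tutorial
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'position': ['G', 'G', 'F', 'F', 'F', 'G', 'G', 'G', 'G', 'F'],
                   'points': [15, 22, 24, 25, 20, 35, 34, 19, 14, 12]})

# Count rows for each (team, position) combination
combo_counts = df.groupby(['team', 'position']).size()

# Look up a single combination with a tuple of index labels
a_f = combo_counts.loc[('A', 'F')]

# Look up all positions for one team using only the first index level
team_b = combo_counts.loc['B']

print(a_f)
print(team_b)
```

Here a_f holds the count for team A and position F, and team_b is a smaller Series of counts indexed by position for team B only.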
Example 3: Count Occurrences Grouped by Multiple Variables and Sort
The following code shows how to use the groupby() and size() functions to count the occurrences of values for each combination of values in the team and position columns, then sort by count:
#count occurrences for each combination of team and position and sort
df.groupby(['team', 'position']).size().sort_values(ascending=False)
team  position
B     G           4
A     F           3
      G           2
B     F           1
dtype: int64
The output shows the count of each combination of team and position values, sorted by count in descending order.
Note: To sort by count in ascending order, simply remove ascending=False from the sort_values() function, since sort_values() sorts in ascending order by default.
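As a quick illustration of the note above, the following sketch calls sort_values() with no arguments on the same grouped counts. The variable name ascending_counts is illustrative:

```python
import pandas as pd

# Same DataFrame as used throughout this tutorial
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'position': ['G', 'G', 'F', 'F', 'F', 'G', 'G', 'G', 'G', 'F'],
                   'points': [15, 22, 24, 25, 20, 35, 34, 19, 14, 12]})

# sort_values() defaults to ascending=True, so the smallest count comes first
ascending_counts = df.groupby(['team', 'position']).size().sort_values()

print(ascending_counts)
```

With this data the combination of team B and position F (1 occurrence) appears first and team B and position G (4 occurrences) appears last.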