Pandas: How to Use groupby() with size()


You can combine the groupby() and size() methods in pandas to count the number of rows in each group, in any of the following ways:

Method 1: Count Occurrences Grouped by One Variable

df.groupby('var1').size()

Method 2: Count Occurrences Grouped by Multiple Variables

df.groupby(['var1', 'var2']).size()

Method 3: Count Occurrences Grouped by Multiple Variables and Sort by Count

df.groupby(['var1', 'var2']).size().sort_values(ascending=False)

The examples below show how to use each method in practice with this pandas DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'position': ['G', 'G', 'F', 'F', 'F', 'G', 'G', 'G', 'G', 'F'],
                   'points': [15, 22, 24, 25, 20, 35, 34, 19, 14, 12]})

#view DataFrame
print(df)

  team position  points
0    A        G      15
1    A        G      22
2    A        F      24
3    A        F      25
4    A        F      20
5    B        G      35
6    B        G      34
7    B        G      19
8    B        G      14
9    B        F      12

Example 1: Count Occurrences Grouped by One Variable

The following code shows how to use the groupby() and size() functions to count the occurrences of values in the team column:

#count occurrences of each value in team column
df.groupby('team').size()

team
A    5
B    5
dtype: int64

From the output we can see that the values A and B each occur 5 times in the team column.
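
Because size() returns a pandas Series indexed by group label, individual counts can be read back by label. The snippet below is a minimal sketch of that idea using the same DataFrame; the variable name team_counts is only illustrative and does not come from the original example.

```python
import pandas as pd

# same DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'position': ['G', 'G', 'F', 'F', 'F', 'G', 'G', 'G', 'G', 'F'],
                   'points': [15, 22, 24, 25, 20, 35, 34, 19, 14, 12]})

# team_counts is an illustrative name for the Series returned by size()
team_counts = df.groupby('team').size()

# look up the count for a single group by its label
print(int(team_counts['A']))

# the group sizes add up to the total number of rows
print(int(team_counts.sum()) == len(df))
```

The first print shows the count for team A, and the second confirms that every row was assigned to exactly one group.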

Example 2: Count Occurrences Grouped by Multiple Variables

The following code shows how to use the groupby() and size() functions to count the occurrences of values for each combination of values in the team and position columns:

#count occurrences of values for each combination of team and position
df.groupby(['team', 'position']).size()

team  position
A     F           3
      G           2
B     F           1
      G           4
dtype: int64

From the output we can see:

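  • The combination of team A and position F occurs 3 times.
  • The combination of team A and position G occurs 2 times.
  • The combination of team B and position F occurs 1 time.
  • The combination of team B and position G occurs 4 times.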
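
The result above is a Series with a two-level index. If a regular DataFrame with one row per combination is easier to work with, one possible approach is to call reset_index() on the result. This is a sketch rather than the only way to do it; the column name 'count' and the variable names are assumptions chosen for illustration.

```python
import pandas as pd

# same DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'position': ['G', 'G', 'F', 'F', 'F', 'G', 'G', 'G', 'G', 'F'],
                   'points': [15, 22, 24, 25, 20, 35, 34, 19, 14, 12]})

# Series of counts indexed by (team, position)
pair_counts = df.groupby(['team', 'position']).size()

# look up one combination with a tuple of index labels
print(int(pair_counts.loc[('B', 'G')]))

# turn the index levels into columns; 'count' is an illustrative column name
counts_df = pair_counts.reset_index(name='count')
print(counts_df)
```

The resulting DataFrame has the columns team, position, and count, which makes it straightforward to filter or merge with other data.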

Example 3: Count Occurrences Grouped by Multiple Variables and Sort

The following code shows how to use the groupby() and size() functions to count the occurrences of values for each combination of values in the team and position columns, then sort by count:

#count occurrences for each combination of team and position and sort
df.groupby(['team', 'position']).size().sort_values(ascending=False)

team  position
B     G           4
A     F           3
      G           2
B     F           1
dtype: int64

The output shows the count of each combination of team and position values, sorted by count in descending order.

Note: To sort by count in ascending order, remove ascending=False from the sort_values() function, since ascending=True is the default.
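
As a brief sketch of the ascending case described in the note, the code below calls sort_values() with no arguments on the same grouped counts. It also uses idxmax(), which is not part of the original example, as one way to pull out the most frequent combination without sorting; the variable names are illustrative.

```python
import pandas as pd

# same DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'position': ['G', 'G', 'F', 'F', 'F', 'G', 'G', 'G', 'G', 'F'],
                   'points': [15, 22, 24, 25, 20, 35, 34, 19, 14, 12]})

pair_counts = df.groupby(['team', 'position']).size()

# sort_values() sorts in ascending order by default
ascending_counts = pair_counts.sort_values()
print(ascending_counts)

# idxmax() returns the index label (here a tuple) of the largest count
most_common = pair_counts.idxmax()
print(most_common)
```

With this data, the smallest count (team B, position F) appears first in the sorted output, and idxmax() identifies team B, position G as the most frequent combination.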

Additional Resources

The following tutorials explain how to perform other common tasks in pandas:

How to Count Unique Values Using Pandas GroupBy
How to Apply Function to Pandas Groupby
How to Create Bar Plot from Pandas GroupBy
