How to Count Unique Values Using Pandas GroupBy


You can use the following basic syntax to count the number of unique values by group in a pandas DataFrame:

df.groupby('group_column')['count_column'].nunique()

The following examples show how to use this syntax with the following DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'position': ['G', 'G', 'G', 'F', 'F', 'G', 'G', 'F', 'F', 'F'],
                   'points': [5, 7, 7, 9, 12, 9, 9, 4, 7, 7],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12, 13, 15]})

#view DataFrame
df

	team	position points	rebounds
0	A	G	 5	11
1	A	G	 7	8
2	A	G	 7	10
3	A	F	 9	6
4	A	F	 12	6
5	B	G	 9	5
6	B	G	 9	9
7	B	F	 4	12
8	B	F	 7	13
9	B	F	 7	15

Example 1: Group By One Column & Count Unique Values

The following code shows how to count the number of unique values in the ‘points’ column for each team:

#count number of unique values in 'points' column grouped by 'team' column
df.groupby('team')['points'].nunique()

team
A    4
B    3
Name: points, dtype: int64

From the output we can see:

  • There are 4 unique ‘points’ values for team A.
  • There are 3 unique ‘points’ values for team B.

Note that we can also use the unique() function to display each unique ‘points’ value by team:

#display unique values in 'points' column grouped by 'team'
df.groupby('team')['points'].unique()

team
A    [5, 7, 9, 12]
B        [9, 4, 7]
Name: points, dtype: object

Example 2: Group By Multiple Columns & Count Unique Values

The following code shows how to count the number of unique values in the ‘points’ column, grouped by team and position:

#count number of unique values in 'points' column grouped by 'team' and 'position'
df.groupby(['team', 'position'])['points'].nunique()

team  position
A     F           2
      G           2
B     F           2
      G           1
Name: points, dtype: int64

From the output we can see:

  • There are 2 unique ‘points’ values for players in position ‘F’ on team A.
  • There are 2 unique ‘points’ values for players in position ‘G’ on team A.
  • There are 2 unique ‘points’ values for players in position ‘F’ on team B.
  • There is1 unique ‘points’ value for players in position ‘G’ on team B.

Once again, we can use the unique() function to display each unique ‘points’ value by team and position:

#display unique values in 'points' column grouped by 'team' and 'position'
df.groupby(['team', 'position'])['points'].unique()

team  position
A     F           [9, 12]
      G            [5, 7]
B     F            [4, 7]
      G               [9]
Name: points, dtype: object

Additional Resources

The following tutorials explain how to perform other common operations in pandas:

Pandas: How to Find Unique Values in a Column
Pandas: How to Find Unique Values in Multiple Columns
Pandas: How to Count Occurrences of Specific Value in Column

Leave a Reply

Your email address will not be published.