How to Find the Median Value by Group in Pandas


You can use the following basic syntax to calculate the median value by group in pandas:

df.groupby(['group_variable'])['value_variable'].median().reset_index()

You can also use the following syntax to calculate the median value, grouped by several columns:

df.groupby(['group1', 'group2'])['value_variable'].median().reset_index()
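As a minimal sketch, the same shape of result can also be produced by passing as_index=False to groupby() instead of calling reset_index() afterward. The DataFrame below is a hypothetical placeholder that reuses the generic column names from the syntax above; its values are made up for illustration.

```python
import pandas as pd

# hypothetical data using the placeholder column names from the syntax above
df = pd.DataFrame({'group_variable': ['x', 'x', 'y', 'y'],
                   'value_variable': [1, 3, 10, 20]})

# option 1: group, take the median, then move the group labels back into a column
with_reset = df.groupby(['group_variable'])['value_variable'].median().reset_index()

# option 2: keep the group labels as a regular column from the start
with_flag = df.groupby(['group_variable'], as_index=False)['value_variable'].median()

print(with_reset)
print(with_flag)
```

Both options return a DataFrame with one row per group. Which one to use is mostly a matter of preference.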

The following examples show how to use this syntax in practice.

Example 1: Find Median Value by One Group

Suppose we have the following pandas DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'position': ['G', 'G', 'F', 'F', 'G', 'G', 'F', 'F'],
                   'points': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

#view DataFrame
df

	team	position	points	rebounds
0	A	G	5	11
1	A	G	7	8
2	A	F	7	10
3	A	F	9	6
4	B	G	12	6
5	B	G	9	5
6	B	F	9	9
7	B	F	4	12

We can use the following code to find the median value of the ‘points’ column, grouped by team:

#calculate median points by team
df.groupby(['team'])['points'].median().reset_index()

	team	points
0	A	7.0
1	B	9.0

From the output we can see:

  • The median points scored by players on team A is 7.
  • The median points scored by players on team B is 9.
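Each team has an even number of players, so the median is the average of the two middle values once the points are sorted. That is also why the output is shown as a decimal (9.0) even though the points are whole numbers. The short check below is an illustrative sketch for team B, using the points values from the DataFrame above:

```python
import pandas as pd

# points scored by the four players on team B
team_b_points = pd.Series([12, 9, 9, 4])

# sort the values, then average the two middle ones
sorted_points = team_b_points.sort_values().tolist()   # [4, 9, 9, 12]
manual_median = (sorted_points[1] + sorted_points[2]) / 2

print(manual_median)            # 9.0
print(team_b_points.median())   # 9.0
```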

Note that we can also find the median value of two variables at once:

#calculate median points and median rebounds by team
df.groupby(['team'])[['points', 'rebounds']].median().reset_index()

	team	points	rebounds
0	A	7.0	9.0
1	B	9.0	7.5
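If you would like the output columns to carry descriptive names, one possible approach is named aggregation with agg(). This is a sketch rather than the only way to do it, and the output column names median_points and median_rebounds are chosen here purely for illustration:

```python
import pandas as pd

# same DataFrame as in the example above
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'position': ['G', 'G', 'F', 'F', 'G', 'G', 'F', 'F'],
                   'points': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

# calculate both medians by team and give each result column an explicit name
summary = (df.groupby('team')
             .agg(median_points=('points', 'median'),
                  median_rebounds=('rebounds', 'median'))
             .reset_index())

print(summary)
```

The values match the table above (7.0 and 9.0 for team A, 9.0 and 7.5 for team B). Only the column labels differ.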

Example 2: Find Median Value by Multiple Groups

The following code shows how to find the median value of the ‘points’ column, grouped by team and position:

#calculate median points by team and position
df.groupby(['team', 'position'])['points'].median().reset_index()

	team	position	points
0	A	F	8.0
1	A	G	6.0
2	B	F	6.5
3	B	G	10.5

From the output we can see:

  • The median points scored by players in the ‘F’ position on team A is 8.
  • The median points scored by players in the ‘G’ position on team A is 6.
  • The median points scored by players in the ‘F’ position on team B is 6.5.
  • The median points scored by players in the ‘G’ position on team B is 10.5.
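When grouping by two columns, it can be easier to compare the groups side by side. As an optional sketch, calling unstack() on the grouped result (in place of reset_index()) moves the position labels into columns, giving one row per team:

```python
import pandas as pd

# same DataFrame as in the examples above
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'position': ['G', 'G', 'F', 'F', 'G', 'G', 'F', 'F'],
                   'points': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

# median points by team and position, with one column per position
wide = df.groupby(['team', 'position'])['points'].median().unstack()

print(wide)
```

The result has teams A and B as rows and positions F and G as columns, holding the same four medians listed above.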

Additional Resources

The following tutorials explain how to perform other common functions in pandas:

How to Find the Max Value by Group in Pandas
How to Find Sum by Group in Pandas
How to Calculate Quantiles by Group in Pandas
