How to Calculate Conditional Mean in Pandas (With Examples)


You can use the following syntax to calculate a conditional mean in pandas:

df.loc[df['team'] == 'A', 'points'].mean()

This calculates the mean of the ‘points’ column using only the rows in the DataFrame where the ‘team’ column is equal to ‘A’.

The examples below show how to use this syntax in practice with this pandas DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'points': [99, 90, 93, 86, 88, 82],
                   'assists': [33, 28, 31, 39, 34, 30]})

#view DataFrame
print(df)

  team  points  assists
0    A      99       33
1    A      90       28
2    A      93       31
3    B      86       39
4    B      88       34
5    B      82       30

Example 1: Calculate Conditional Mean for Categorical Variable

The following code shows how to calculate the mean of the ‘points’ column for only the rows in the DataFrame where the ‘team’ column has a value of ‘A’.

#calculate mean of 'points' column for rows where team equals 'A'
df.loc[df['team'] == 'A', 'points'].mean()

94.0

The mean value in the ‘points’ column for the rows where ‘team’ is equal to ‘A’ is 94.

We can manually verify this by calculating the average of the points values for only the rows where ‘team’ is equal to ‘A’:

  • Average of Points: (99 + 90 + 93) / 3 = 94
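If the conditional mean is needed for every team rather than just one, grouping by the ‘team’ column is a common alternative to repeating the filter for each value. The following is a minimal sketch using the same DataFrame; the variable name team_means is only illustrative and is not part of the original example.

```python
import pandas as pd

# same DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'points': [99, 90, 93, 86, 88, 82],
                   'assists': [33, 28, 31, 39, 34, 30]})

# mean of 'points' for each distinct value of 'team'
# (team_means is an illustrative name)
team_means = df.groupby('team')['points'].mean()

print(team_means)
```

The result is a Series indexed by team: 94 for team ‘A’, which matches the value from the filter above, and about 85.33 for team ‘B’, which is (86 + 88 + 82) / 3.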

Example 2: Calculate Conditional Mean for Numeric Variable

The following code shows how to calculate the mean of the ‘assists’ column for only the rows in the DataFrame where the ‘points’ column has a value greater than or equal to 90.

#calculate mean of 'assists' column for rows where 'points' >= 90
df.loc[df['points'] >= 90, 'assists'].mean()

30.666666666666668

The mean value in the ‘assists’ column for the rows where ‘points’ is greater than or equal to 90 is 30.66667.

We can manually verify this by calculating the average of the assists values for only the rows where ‘points’ is greater than or equal to 90:

  • Average of Assists: (33 + 28 + 31) / 3 = 30.66667
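A categorical condition and a numeric condition can also be combined into a single filter. The sketch below assumes we want the mean of ‘assists’ for rows where ‘team’ is ‘B’ and ‘points’ is at least 85. The threshold of 85 and the names mask and result are chosen only for illustration. Each condition must be wrapped in parentheses and joined with & (and) or | (or).

```python
import pandas as pd

# same DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'points': [99, 90, 93, 86, 88, 82],
                   'assists': [33, 28, 31, 39, 34, 30]})

# rows where team equals 'B' AND points >= 85 (illustrative threshold)
mask = (df['team'] == 'B') & (df['points'] >= 85)

# mean of 'assists' for only the rows that satisfy both conditions
result = df.loc[mask, 'assists'].mean()

print(result)
```

Two rows meet both conditions (points of 86 and 88), so the result is (39 + 34) / 2 = 36.5.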

Additional Resources

The following tutorials explain how to perform other common tasks in pandas:

How to Calculate the Mean of Columns in Pandas
How to Calculate a Rolling Mean in Pandas
How to Fill NaN Values with Mean in Pandas
