How to Calculate a Trimmed Mean in Python (With Examples)


A trimmed mean is the mean of a dataset calculated after removing a specific percentage of the smallest and largest values.

The easiest way to calculate a trimmed mean in Python is to use the trim_mean() function from the SciPy library.

This function uses the following basic syntax, where the second argument is the proportion of values to cut from each end of the sorted data:

from scipy import stats

#calculate 10% trimmed mean
stats.trim_mean(data, 0.1)

The following examples show how to use this function to calculate a trimmed mean in practice.

Example 1: Calculate Trimmed Mean of Array

The following code shows how to calculate a 10% trimmed mean for an array of data:

from scipy import stats

#define data
data = [22, 25, 29, 11, 14, 18, 13, 13, 17, 11, 8, 8, 7, 12, 15, 6, 8, 7, 9, 12]

#calculate 10% trimmed mean
stats.trim_mean(data, 0.1)

12.375

The 10% trimmed mean is 12.375.

This is the mean of the dataset after the smallest 10% and largest 10% of values have been removed. Because the dataset has 20 values, two values are dropped from each end.
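To see where this number comes from, the same result can be reproduced by hand. The following is a minimal sketch using NumPy sorting and slicing; the variable names (proportion, k, manual_mean) are chosen here for illustration and are not part of SciPy:

```python
import numpy as np
from scipy import stats

#define data
data = [22, 25, 29, 11, 14, 18, 13, 13, 17, 11, 8, 8, 7, 12, 15, 6, 8, 7, 9, 12]

#proportion to cut from each end
proportion = 0.1

#sort values and count how many to drop from each end (rounded down)
sorted_data = np.sort(data)
k = int(proportion * len(sorted_data))

#average the values that remain after dropping k from each end
manual_mean = sorted_data[k:len(sorted_data) - k].mean()
print(manual_mean)

#compare with SciPy
scipy_mean = stats.trim_mean(data, proportion)
print(scipy_mean)
```

Both print statements should show 12.375, since the two smallest values (6, 7) and the two largest values (25, 29) are dropped before averaging the remaining 16 values.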

Example 2: Calculate Trimmed Mean of Column in Pandas

The following code shows how to calculate a 5% trimmed mean for a specific column in a pandas DataFrame:

from scipy import stats
import pandas as pd

#define DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})


#calculate 5% trimmed mean of points
stats.trim_mean(df.points, 0.05) 

20.25

The 5% trimmed mean of the values in the ‘points’ column is 20.25.

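Note that trim_mean() rounds the number of values to cut from each end down to a whole number. The ‘points’ column has only 8 values, and 5% of 8 is 0.4, so no values are actually removed here and the result equals the ordinary mean of the column.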
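With small datasets it is worth checking how many values a given proportion actually removes. The following sketch, which uses illustrative variable names, compares the 5% trimmed mean of the ‘points’ values with the ordinary mean, then uses a proportion of 0.125 so that exactly one value is cut from each end of the 8 values:

```python
import pandas as pd
from scipy import stats

#define the 'points' values as a pandas Series
points = pd.Series([25, 12, 15, 14, 19, 23, 25, 29])

#number of values cut from each end at 5% (0.4 rounds down to 0)
n_cut = int(0.05 * len(points))
print(n_cut)

#5% trimmed mean and ordinary mean are identical when nothing is cut
trimmed_5 = stats.trim_mean(points, 0.05)
plain_mean = points.mean()
print(trimmed_5, plain_mean)

#12.5% of 8 values is exactly 1 value from each end
trimmed_125 = stats.trim_mean(points, 0.125)
print(trimmed_125)
```

The last value should be about 20.17, which is the mean of the six values left after dropping 12 and 29.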

Example 3: Calculate Trimmed Mean of Multiple Columns

The following code shows how to calculate a 5% trimmed mean for multiple columns in a pandas DataFrame:

from scipy import stats
import pandas as pd

#define DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})


#calculate 5% trimmed mean of 'points' and 'assists' columns
stats.trim_mean(df[['points', 'assists']], 0.05)

array([20.25,  7.75])

From the output we can see:

  • The 5% trimmed mean of the ‘points’ column is 20.25.
  • The 5% trimmed mean of the ‘assists’ column is 7.75.
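The array returned above does not carry the column names. If labeled output is preferred, one option is to apply trim_mean() to each column with the pandas apply() method. This is a sketch of one possible approach rather than the only way to do it; the name trimmed is chosen here for illustration:

```python
import pandas as pd
from scipy import stats

#define DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

#apply trim_mean to each selected column, passing the proportion as a keyword argument
trimmed = df[['points', 'assists']].apply(stats.trim_mean, proportiontocut=0.05)

#result is a Series indexed by column name
print(trimmed)
```

The result is a pandas Series, so individual values can be looked up by column name, for example trimmed['points'].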

Note: You can find the complete documentation for the trim_mean() function in the SciPy reference for scipy.stats.trim_mean.

Additional Resources

How to Calculate a Trimmed Mean by Hand
Trimmed Mean Calculator
