How to Calculate Percentile Rank in Pandas (With Examples)


The percentile rank of a value tells us the percentage of values in a dataset that are equal to or less than that value.
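As a quick check of this definition, the percentile rank of a single value can be worked out by hand as the share of values that are less than or equal to it. The sketch below uses a small made-up Series and a hypothetical helper named percentile_rank (neither comes from pandas itself):

```python
import pandas as pd

#illustrative values only
values = pd.Series([2, 5, 5, 7, 9, 13, 15])

def percentile_rank(series, x):
    """Hypothetical helper: share of entries in series that are <= x."""
    return (series <= x).mean()

#4 of the 7 values are equal to or less than 7
print(percentile_rank(values, 7))
```

For any value that is not tied with another, this count-based figure should agree with what rank(pct=True) returns. Tied values can differ, because rank() averages their ranks by default.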

You can use the following methods to calculate percentile rank in pandas:

Method 1: Calculate Percentile Rank for Column

df['percent_rank'] = df['some_column'].rank(pct=True)

Method 2: Calculate Percentile Rank by Group

df['percent_rank'] = df.groupby('group_var')['value_var'].transform('rank', pct=True)

The examples below show how to use each method in practice with this pandas DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'A', 'A',
                            'B', 'B', 'B', 'B', 'B', 'B', 'B'],
                   'points': [2, 5, 5, 7, 9, 13, 15, 17, 22, 24, 30, 31, 38, 39]})

#view DataFrame
print(df)

   team  points
0     A       2
1     A       5
2     A       5
3     A       7
4     A       9
5     A      13
6     A      15
7     B      17
8     B      22
9     B      24
10    B      30
11    B      31
12    B      38
13    B      39

Example 1: Calculate Percentile Rank for Column

The following code shows how to calculate the percentile rank of each value in the points column:

#add new column that shows percentile rank of points
df['percent_rank'] = df['points'].rank(pct=True)

#view updated DataFrame
print(df)

   team  points  percent_rank
0     A       2      0.071429
1     A       5      0.178571
2     A       5      0.178571
3     A       7      0.285714
4     A       9      0.357143
5     A      13      0.428571
6     A      15      0.500000
7     B      17      0.571429
8     B      22      0.642857
9     B      24      0.714286
10    B      30      0.785714
11    B      31      0.857143
12    B      38      0.928571
13    B      39      1.000000

Here’s how to interpret the values in the percent_rank column:

  • 7.14% of the points values are equal to or less than 2.
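  • The two values of 5 are tied, so each receives the average of ranks 2 and 3 (2.5 out of 14, or 17.86%). Counted directly, 3 of the 14 values (21.43%) are equal to or less than 5; rank() reports the lower figure because its default is method='average'.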
  • 28.57% of the points values are equal to or less than 7.

And so on.
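One caveat concerns tied values. By default, rank() uses method='average', so tied values share the mean of their ranks, and the result is not exactly the share of values "equal to or less than" them. If that literal reading is what you need, one option is method='max'. The sketch below compares the two; the column names pct_average and pct_max are chosen here for illustration only:

```python
import pandas as pd

#recreate the DataFrame from above
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'A', 'A',
                            'B', 'B', 'B', 'B', 'B', 'B', 'B'],
                   'points': [2, 5, 5, 7, 9, 13, 15, 17, 22, 24, 30, 31, 38, 39]})

#default: tied values share the average of their ranks (2.5 / 14 for the 5s)
df['pct_average'] = df['points'].rank(pct=True)

#method='max': tied values all take the highest rank in the tie (3 / 14 for the 5s)
df['pct_max'] = df['points'].rank(pct=True, method='max')

#compare the two columns for the tied rows
print(df[df['points'] == 5])
```

For values without ties, the two columns are identical, so the choice of method only matters when the column contains duplicates.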

Example 2: Calculate Percentile Rank by Group

The following code shows how to calculate the percentile rank of each value in the points column, grouped by team:

#add new column that shows percentile rank of points, grouped by team
df['percent_rank'] = df.groupby('team')['points'].transform('rank', pct=True)

#view updated DataFrame
print(df)

   team  points  percent_rank
0     A       2      0.142857
1     A       5      0.357143
2     A       5      0.357143
3     A       7      0.571429
4     A       9      0.714286
5     A      13      0.857143
6     A      15      1.000000
7     B      17      0.142857
8     B      22      0.285714
9     B      24      0.428571
10    B      30      0.571429
11    B      31      0.714286
12    B      38      0.857143
13    B      39      1.000000

Here’s how to interpret the values in the percent_rank column:

  • 14.3% of the points values for team A are equal to or less than 2.
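  • The two values of 5 are tied, so each receives the average of ranks 2 and 3 (2.5 out of 7, or 35.7%). Counted directly, 3 of the 7 team A values (42.9%) are equal to or less than 5.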
  • 57.1% of the points values for team A are equal to or less than 7.

And so on.
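As an alternative to transform('rank', pct=True), a grouped object also has its own rank() method, which should give the same result and accepts the same method argument for handling ties. The sketch below is one way to check this; the variable names by_transform, by_rank and by_rank_max are illustrative:

```python
import pandas as pd

#recreate the DataFrame from above
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'A', 'A',
                            'B', 'B', 'B', 'B', 'B', 'B', 'B'],
                   'points': [2, 5, 5, 7, 9, 13, 15, 17, 22, 24, 30, 31, 38, 39]})

#percentile rank within each team, computed two ways
by_transform = df.groupby('team')['points'].transform('rank', pct=True)
by_rank = df.groupby('team')['points'].rank(pct=True)

#largest absolute difference between the two results
print((by_transform - by_rank).abs().max())

#method='max' gives tied values the highest rank in the tie (3 / 7 for team A's 5s)
by_rank_max = df.groupby('team')['points'].rank(pct=True, method='max')
print(by_rank_max.head(3))
```

Calling rank() directly on the grouped column is slightly shorter, while transform() makes it explicit that the result has the same length and index as the original DataFrame.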

Additional Resources

The following tutorials explain how to perform other common tasks in pandas:

How to Calculate Percent Change in Pandas
How to Calculate Cumulative Percentage in Pandas
How to Calculate Percentage of Total Within Group in Pandas
