The percentile rank of a value tells us the percentage of values in a dataset that are equal to or below that value.
You can use the following methods to calculate percentile rank in pandas:
Method 1: Calculate Percentile Rank for Column
df['percent_rank'] = df['some_column'].rank(pct=True)
Method 2: Calculate Percentile Rank by Group
df['percent_rank'] = df.groupby('group_var')['value_var'].transform('rank', pct=True)
The examples below show how to use each method in practice with this pandas DataFrame:
import pandas as pd
#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'A', 'A',
                            'B', 'B', 'B', 'B', 'B', 'B', 'B'],
                   'points': [2, 5, 5, 7, 9, 13, 15, 17, 22, 24, 30, 31, 38, 39]})
#view DataFrame
print(df)
   team  points
0     A       2
1     A       5
2     A       5
3     A       7
4     A       9
5     A      13
6     A      15
7     B      17
8     B      22
9     B      24
10    B      30
11    B      31
12    B      38
13    B      39
Example 1: Calculate Percentile Rank for Column
The following code shows how to calculate the percentile rank of each value in the points column:
#add new column that shows percentile rank of points
df['percent_rank'] = df['points'].rank(pct=True)
#view updated DataFrame
print(df)
   team  points  percent_rank
0     A       2      0.071429
1     A       5      0.178571
2     A       5      0.178571
3     A       7      0.285714
4     A       9      0.357143
5     A      13      0.428571
6     A      15      0.500000
7     B      17      0.571429
8     B      22      0.642857
9     B      24      0.714286
10    B      30      0.785714
11    B      31      0.857143
12    B      38      0.928571
13    B      39      1.000000
Here’s how to interpret the values in the percent_rank column:
- 7.14% of the points values are equal to or less than 2.
- The two values of 5 are tied for ranks 2 and 3. By default, rank() gives tied values the average of their ranks (2.5), so each has a percentile rank of 2.5 / 14 = 17.86%.
- 28.57% of the points values are equal to or less than 7.
And so on.
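Because rank() averages the ranks of tied values by default, a tied value's percentile rank can fall below the share of values equal to or below it. The sketch below is one way to get the "equal to or below" share for every value, assuming method='max' matches the definition you need. It also checks the result against a manual count. The variable names (points, avg_rank, max_rank, manual) are illustrative only.

```python
import pandas as pd

# Same points values as the example DataFrame
points = pd.Series([2, 5, 5, 7, 9, 13, 15, 17, 22, 24, 30, 31, 38, 39])

# Default behavior (method='average'): tied values share the mean of their ranks
avg_rank = points.rank(pct=True)

# method='max': tied values all take the highest rank in the tie,
# so each result is the share of values equal to or below that value
max_rank = points.rank(pct=True, method='max')

# Manual check of the definition: fraction of values <= x for each x
manual = points.apply(lambda x: (points <= x).mean())

comparison = pd.DataFrame({'points': points,
                           'average': avg_rank,
                           'max': max_rank,
                           'manual': manual})
print(comparison)
```

With method='max', each value of 5 gets a percentile rank of 3 / 14 (about 21.43%), which is the share of points values equal to or less than 5. Values without ties are unaffected by the choice of method.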
Example 2: Calculate Percentile Rank by Group
The following code shows how to calculate the percentile rank of each value in the points column, grouped by team:
#add new column that shows percentile rank of points, grouped by team
df['percent_rank'] = df.groupby('team')['points'].transform('rank', pct=True)
#view updated DataFrame
print(df)
   team  points  percent_rank
0     A       2      0.142857
1     A       5      0.357143
2     A       5      0.357143
3     A       7      0.571429
4     A       9      0.714286
5     A      13      0.857143
6     A      15      1.000000
7     B      17      0.142857
8     B      22      0.285714
9     B      24      0.428571
10    B      30      0.571429
11    B      31      0.714286
12    B      38      0.857143
13    B      39      1.000000
Here’s how to interpret the values in the percent_rank column:
- 14.3% of the points values for team A are equal to or less than 2.
- The two values of 5 for team A are tied for ranks 2 and 3, so each receives the average rank of 2.5 and a percentile rank of 2.5 / 7 = 35.7%.
- 57.1% of the points values for team A are equal to or less than 7.
And so on.
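As a cross-check on the grouped calculation, the sketch below calls rank() directly on the grouped column, which should give the same values as transform('rank', pct=True). It also shows a method='max' variant for cases where tied values should report the within-group share of values equal to or below them. The names via_transform, via_rank, and percent_rank_max are illustrative only.

```python
import pandas as pd

# Rebuild the example DataFrame
df = pd.DataFrame({'team': ['A'] * 7 + ['B'] * 7,
                   'points': [2, 5, 5, 7, 9, 13, 15, 17, 22, 24, 30, 31, 38, 39]})

# Approach used in Example 2
via_transform = df.groupby('team')['points'].transform('rank', pct=True)

# Calling rank() on the grouped column directly
via_rank = df.groupby('team')['points'].rank(pct=True)

# Confirm that the two approaches agree
print((via_transform - via_rank).abs().max())

# Variant: tied values take the highest rank in the tie within each team
df['percent_rank_max'] = df.groupby('team')['points'].rank(pct=True, method='max')
print(df)
```

With method='max', each value of 5 in team A gets 3 / 7 (about 42.9%), which is the share of team A's points values equal to or less than 5.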
Additional Resources
The following tutorials explain how to perform other common tasks in pandas:
How to Calculate Percent Change in Pandas
How to Calculate Cumulative Percentage in Pandas
How to Calculate Percentage of Total Within Group in Pandas