How to Calculate Percentile Rank in R (2 Examples)


The percentile rank of a value is the percentage of values in a dataset that are equal to or less than that value.
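As a quick illustration of this definition, here is a minimal base R sketch. The helper name percentile_rank and the sample vector are assumptions made for this example only.

```r
# Hypothetical helper: share of values in x that are equal to or less than value
percentile_rank <- function(x, value) {
  mean(x <= value)
}

# Illustrative data
points <- c(2, 5, 5, 7, 9, 13, 15)

# 4 of the 7 values are equal to or less than 7
percentile_rank(points, 7)
```

Multiplying the result by 100 expresses it as a percentage.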

You can use the following methods to calculate percentile rank in R:

Method 1: Calculate Percentile Rank for Entire Dataset

library(dplyr)

df %>%
  mutate(percent_rank = rank(x)/length(x))

Method 2: Calculate Percentile Rank by Group

library(dplyr)

df %>%
  group_by(group_var) %>%
  mutate(percent_rank = rank(x)/length(x))
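One caveat with both methods: rank() places missing values last by default and length() counts them, so any NA in x shifts the results. The sketch below shows one possible adjustment, using a made-up vector that is not part of this tutorial's data.

```r
# Made-up values with one missing entry
x <- c(4, NA, 1, 9)

# Default behavior: the NA is ranked last and counted in the denominator
rank(x) / length(x)

# Possible adjustment: keep NA as NA and divide by the number of non-missing values
pr <- rank(x, na.last = "keep") / sum(!is.na(x))
pr
```

The same expression can be used inside mutate() if the column may contain missing values.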

The examples below show how to use each method in practice with this data frame:

#create data frame
df <- data.frame(team=rep(c('A', 'B'), each=7),
                 points=c(2, 5, 5, 7, 9, 13, 15, 17, 22, 24, 30, 31, 38, 39))

#view data frame
df

   team points
1     A      2
2     A      5
3     A      5
4     A      7
5     A      9
6     A     13
7     A     15
8     B     17
9     B     22
10    B     24
11    B     30
12    B     31
13    B     38
14    B     39

Example 1: Calculate Percentile Rank for Entire Dataset

The following code shows how to use functions from the dplyr package in R to calculate the percentile rank of each value in the points column:

library(dplyr)

#calculate percentile rank of points values
df %>%
  mutate(percent_rank = rank(points)/length(points))

   team points percent_rank
1     A      2   0.07142857
2     A      5   0.17857143
3     A      5   0.17857143
4     A      7   0.28571429
5     A      9   0.35714286
6     A     13   0.42857143
7     A     15   0.50000000
8     B     17   0.57142857
9     B     22   0.64285714
10    B     24   0.71428571
11    B     30   0.78571429
12    B     31   0.85714286
13    B     38   0.92857143
14    B     39   1.00000000

Here’s how to interpret the values in the percent_rank column:

  • 7.14% of the points values are equal to or less than 2.
  • The two tied values of 5 share the average rank 2.5, which gives 17.86%; the actual share of points values equal to or less than 5 is 3/14 = 21.43%.
  • 28.57% of the points values are equal to or less than 7.

And so on.
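If the result should match the "equal to or less than" reading exactly even when values are tied, one option is to pass ties.method = "max" to rank(). This is a sketch of that variation, not the approach used in the example above:

```r
points <- c(2, 5, 5, 7, 9, 13, 15, 17, 22, 24, 30, 31, 38, 39)

# Default ties.method = "average": both 5s receive rank 2.5
avg_rank <- rank(points) / length(points)

# ties.method = "max": both 5s receive rank 3, so each result is the
# exact share of values equal to or less than that value
max_rank <- rank(points, ties.method = "max") / length(points)

# Compare the two results for the tied value 5
avg_rank[2]
max_rank[2]

# Direct check of the share of values equal to or less than 5
mean(points <= 5)
```

The dplyr function cume_dist() computes this same quantity and can be used inside mutate().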

Example 2: Calculate Percentile Rank by Group

The following code shows how to use functions from the dplyr package in R to calculate the percentile rank of each value in the points column, grouped by team:

library(dplyr)

#calculate percentile rank of points values grouped by team
df %>%
  group_by(team) %>%
  mutate(percent_rank = rank(points)/length(points))

# A tibble: 14 x 3
# Groups:   team [2]
   team  points percent_rank
   <chr>  <dbl>        <dbl>
 1 A          2        0.143
 2 A          5        0.357
 3 A          5        0.357
 4 A          7        0.571
 5 A          9        0.714
 6 A         13        0.857
 7 A         15        1    
 8 B         17        0.143
 9 B         22        0.286
10 B         24        0.429
11 B         30        0.571
12 B         31        0.714
13 B         38        0.857
14 B         39        1   

Here’s how to interpret the values in the percent_rank column:

  • 14.3% of the points values for team A are equal to or less than 2.
  • The two tied values of 5 share the average rank 2.5, which gives 35.7%; the actual share of team A points values equal to or less than 5 is 3/7 = 42.9%.
  • 57.1% of the points values for team A are equal to or less than 7.

And so on.
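For readers who prefer not to load dplyr, the grouped calculation can also be sketched in base R with ave(), which applies a function within each group and returns a vector aligned with the original rows. This is an alternative shown for comparison; it should reproduce the percent_rank values above.

```r
# Same data frame as in the examples
df <- data.frame(team = rep(c("A", "B"), each = 7),
                 points = c(2, 5, 5, 7, 9, 13, 15, 17, 22, 24, 30, 31, 38, 39))

# Rank points within each team and divide by the team's number of rows
df$percent_rank <- ave(df$points, df$team,
                       FUN = function(x) rank(x) / length(x))

df
```

Unlike the dplyr version, the result here is a plain data frame rather than a grouped tibble.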

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Calculate Percentiles in R
How to Calculate Quartiles in R
How to Calculate Quantiles by Group in R
