# How to Use slice_max() in dplyr

Often you may want to select the rows with the largest value in a particular column of a data frame in R.

Fortunately, this is easy to do with the slice_max() function from the dplyr package, which is designed for exactly this task.

The slice_max() function uses the following basic syntax:

slice_max(.data, order_by, ..., n)

where:

• .data: The name of the data frame
• order_by: Variable or function of variables to order by
• n: The number of rows to select (passed by name, e.g. n=5)

Note that if you don’t specify a value for the n argument, slice_max() uses n = 1 and returns the row with the largest value of the ordering variable.
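As a quick illustration of these arguments, here is a minimal sketch. The data frame `scores` and its columns are hypothetical and exist only for this snippet. It shows order_by given first as a single column (with n left at its default) and then as a function of two columns.

```r
library(dplyr)

#hypothetical data, used only for this illustration
scores <- data.frame(player=c('p1', 'p2', 'p3'),
                     points=c(10, 25, 18),
                     assists=c(12, 3, 9))

#order_by is a single column; n is omitted, so n = 1 is used
scores %>% slice_max(points)

#order_by is a function of columns: rank rows by points + assists
scores %>% slice_max(points + assists, n=2)
```

The first call should return the row for p2 (25 points). The second should return p2 and then p3, whose combined totals are 28 and 27.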

The following example shows how to use the slice_max() function in practice.

## Example: How to Use slice_max() in dplyr

Suppose we create the following data frame that contains information about various basketball players:

#create data frame
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(99, 68, 86, 88, 95, 74, 78, 93),
                 assists=c(22, 28, 45, 35, 34, 45, 28, 31),
                 rebounds=c(30, 28, 24, 24, 30, 36, 30, 29))

#view data frame
df

  team points assists rebounds
1    A     99      22       30
2    A     68      28       28
3    A     86      45       24
4    A     88      35       24
5    B     95      34       30
6    B     74      45       36
7    B     78      28       30
8    B     93      31       29

Suppose that we would like to select the row with the largest value in the rebounds column of the data frame.

We can use the following syntax to do so:

library(dplyr)

#select row with largest value in rebounds column
df %>% slice_max(rebounds)

  team points assists rebounds
1    B     74      45       36

This returns the one row that has the largest value in the rebounds column of the data frame, which turns out to be the row with a value of 36.

Note that we can specify a different value for the n argument to instead return the rows with the n largest values.

For example, we can use the following syntax to return the rows with the 5 largest values in the rebounds column:

library(dplyr)

#select rows with 5 largest values in rebounds column
df %>% slice_max(rebounds, n=5)

  team points assists rebounds
1    B     74      45       36
2    A     99      22       30
3    B     95      34       30
4    B     78      28       30
5    B     93      31       29

This returns the five rows with the largest values in the rebounds column of the data frame.
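Because the data frame also has a team column, a natural variation is to select the top row within each team. The sketch below is an illustration that goes slightly beyond the example above: it rebuilds the same df so that it runs on its own and assumes dplyr's group_by() and ungroup() are used to do the grouping.

```r
library(dplyr)

#recreate the data frame from the example
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(99, 68, 86, 88, 95, 74, 78, 93),
                 assists=c(22, 28, 45, 35, 34, 45, 28, 31),
                 rebounds=c(30, 28, 24, 24, 30, 36, 30, 29))

#select row with largest value in rebounds column within each team
top_by_team <- df %>%
  group_by(team) %>%
  slice_max(rebounds, n=1) %>%
  ungroup()

top_by_team
```

With this data, each team has a single maximum (30 for team A and 36 for team B), so the result contains one row per team.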

It’s worth noting that by default (with_ties = TRUE), if several rows are tied for the largest value then all of them will be returned, so the result can contain more than n rows.

For example, suppose we attempt to select the one row with the highest value in the assists column of the data frame:

library(dplyr)

#select row with largest value in assists column
df %>% slice_max(assists)

  team points assists rebounds
1    A     86      45       24
2    B     74      45       36

Notice that this returns two rows because there are two rows that are tied for the largest value (45) in the assists column.

Keep this in mind when using the slice_max() function.
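If you need exactly n rows even when there are ties, slice_max() also accepts a with_ties argument. The following is a minimal sketch that rebuilds the same df so that it runs on its own and sets with_ties=FALSE to keep only one of the tied rows.

```r
library(dplyr)

#recreate the data frame from the example
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(99, 68, 86, 88, 95, 74, 78, 93),
                 assists=c(22, 28, 45, 35, 34, 45, 28, 31),
                 rebounds=c(30, 28, 24, 24, 30, 36, 30, 29))

#select exactly one row with largest value in assists column, ignoring ties
one_row <- df %>% slice_max(assists, with_ties=FALSE)

one_row
```

This returns a single row with 45 assists instead of both tied rows, which can be useful when later code expects exactly one row.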

Note: You can find the complete documentation for the slice_max() function in the dplyr package reference.