How to Use slice_min() in dplyr


Often you may want to select the rows with the smallest value in a particular column of a data frame in R.

Fortunately, the slice_min() function from the dplyr package is designed to perform this exact task.

The slice_min() function uses the following basic syntax:

slice_min(.data, order_by, ..., n)

where:

  • .data: The name of the data frame
  • order_by: Variable or function of variables to order by
  • n: The number of rows to select; it must be passed by name (for example, n = 5)

Note that if you don’t specify a value for the n argument, slice_min() uses n = 1 and returns the row with the smallest value of the order_by variable.
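As a minimal sketch of the two points above, the snippet below checks that omitting n behaves like n = 1, and that order_by can be an expression built from several columns. The scores data frame and its column values are hypothetical and exist only for this illustration.

```r
library(dplyr)

# hypothetical toy data, not the data frame used in the example below
scores <- data.frame(player = c("p1", "p2", "p3"),
                     points = c(10, 7, 12),
                     assists = c(1, 6, 0))

# omitting n is the same as asking for n = 1
default_row <- slice_min(scores, points)
one_row <- slice_min(scores, points, n = 1)
identical(default_row, one_row)

# order_by can also be an expression built from several columns
lowest_total <- slice_min(scores, points + assists)
lowest_total
```

Here the first two calls should pick the player with the fewest points, while the last call picks the player with the smallest combined points and assists.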

The following example shows how to use the slice_min() function in practice.

Example: How to Use slice_min() in dplyr

Suppose we create the following data frame that contains information about various basketball players:

#create data frame
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(99, 68, 86, 88, 95, 74, 78, 93),
                 assists=c(22, 28, 31, 35, 34, 45, 28, 31),
                 rebounds=c(30, 28, 24, 24, 30, 36, 30, 29))

#view data frame
df

  team points assists rebounds
1    A     99      22       30
2    A     68      28       28
3    A     86      31       24
4    A     88      35       24
5    B     95      34       30
6    B     74      45       36
7    B     78      28       30
8    B     93      31       29

Suppose that we would like to select the row with the smallest value in the assists column of the data frame.

We can use the following syntax to do so:

library(dplyr)

#select row with smallest value in assists column
df %>% slice_min(assists)

  team points assists rebounds
1    A     99      22       30

This returns the one row that has the smallest value in the assists column of the data frame, which turns out to be the row with a value of 22.
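One way to sanity-check a result like this is to compare it against base R's min(). The sketch below does that on a small hypothetical data frame (toy, with made-up id and value columns), so it runs on its own.

```r
library(dplyr)

# hypothetical toy data, used only for this check
toy <- data.frame(id = c("a", "b", "c", "d"),
                  value = c(14, 9, 20, 11))

# dplyr: row with the smallest value
picked <- toy %>% slice_min(value)

# base R equivalent: keep the rows whose value equals the column minimum
base_pick <- toy[toy$value == min(toy$value), ]

# both approaches should select the same row
picked$id == base_pick$id
```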

Note that we can specify a different value for the n argument to instead return the n rows with the smallest values.

For example, we can use the following syntax to return the rows with the 5 smallest values in the assists column:

library(dplyr)

#select rows with 5 smallest values in assists column
df %>% slice_min(assists, n=5)

  team points assists rebounds
1    A     99      22       30
2    A     68      28       28
3    B     78      28       30
4    A     86      31       24
5    B     93      31       29

This returns the five rows with the smallest values in the assists column of the data frame.
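The rows come back ordered from the smallest value to the largest. As a rough sketch, the snippet below compares slice_min() with a sort-then-truncate pipeline built from arrange() and head(). The toy data frame is hypothetical and contains no tied values, which is the case where the two approaches are expected to agree.

```r
library(dplyr)

# hypothetical toy data with no tied values
toy <- data.frame(id = c("a", "b", "c", "d", "e"),
                  value = c(14, 9, 20, 11, 17))

# three rows with the smallest values
three_smallest <- toy %>% slice_min(value, n = 3)

# sorting and truncating selects the same rows when no ties sit at the cutoff
sorted_head <- toy %>% arrange(value) %>% head(3)

three_smallest$id
sorted_head$id
```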

It’s worth noting that if multiple rows are tied for the smallest value, then all of the tied rows will be returned.

For example, suppose we attempt to select the one row with the smallest value in the rebounds column of the data frame:

library(dplyr)

#select row with smallest value in rebounds column
df %>% slice_min(rebounds)

  team points assists rebounds
1    A     86      31       24
2    A     88      35       24

This returns two rows because two rows are tied for the smallest value (24) in the rebounds column.

Keep this in mind when using the slice_min() function.
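If exactly n rows are needed even when values are tied, slice_min() also accepts a with_ties argument, which defaults to TRUE. The sketch below contrasts the two settings on a hypothetical toy data frame in which two rows share the minimum.

```r
library(dplyr)

# hypothetical toy data: rows "b" and "c" are tied for the smallest value
toy <- data.frame(id = c("a", "b", "c", "d"),
                  value = c(5, 3, 3, 8))

# default (with_ties = TRUE): every row tied for the minimum is returned
with_ties_rows <- toy %>% slice_min(value)

# with_ties = FALSE: exactly n rows are returned, here a single row
single_row <- toy %>% slice_min(value, with_ties = FALSE)

nrow(with_ties_rows)
nrow(single_row)
```

Setting with_ties = FALSE is a reasonable choice when downstream code assumes a fixed number of rows, at the cost of dropping some of the tied rows.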

Note: You can find the complete documentation for the slice_min() function in the official dplyr package documentation.

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Insert Row into Data Frame in R
How to Append Values to List in R
How to Convert Data Frame Column to List in R
How to Count Number of Elements in List in R
