How to Subset Data Frame by List of Values in R


You can use any of the following methods to subset a data frame by a list of values in R, where the values are stored in a vector named vals:

Method 1: Use Base R

df_new <- df[df$my_column %in% vals,]

Method 2: Use dplyr

library(dplyr)

df_new <- filter(df, my_column %in% vals)

Method 3: Use data.table

library(data.table)

df_new <- setDT(df, key='my_column')[J(vals)]

The examples below show how to use each of these methods in practice with this data frame:

#create data frame
df <- data.frame(team=c('A', 'B', 'B', 'B', 'C', 'C', 'C', 'D'),
                 points=c(12, 22, 35, 34, 20, 28, 30, 18),
                 assists=c(4, 10, 11, 12, 12, 8, 6, 10))

#view data frame
df

  team points assists
1    A     12       4
2    B     22      10
3    B     35      11
4    B     34      12
5    C     20      12
6    C     28       8
7    C     30       6
8    D     18      10

Method 1: Subset Data Frame by List of Values in Base R

The following code shows how to subset the data frame to only contain rows that have a value of ‘A’ or ‘C’ in the team column:

#define values to subset by
vals <- c('A', 'C')

#subset data frame to only contain rows where team is 'A' or 'C'
df_new <- df[df$team %in% vals,]

#view results
df_new

  team points assists
1    A     12       4
5    C     20      12
6    C     28       8
7    C     30       6

The resulting data frame only contains rows that have a value of ‘A’ or ‘C’ in the team column.

Note that we used functions from base R in this example so we didn’t have to load any extra packages.
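
It is worth seeing why %in% is used here instead of ==. The == operator compares element by element and recycles the shorter vector, so it can silently miss rows. The following is a minimal sketch using the same data frame and values as above; the names wrong and right are only for illustration:

```r
#recreate the example data frame and the values to subset by
df <- data.frame(team=c('A', 'B', 'B', 'B', 'C', 'C', 'C', 'D'),
                 points=c(12, 22, 35, 34, 20, 28, 30, 18),
                 assists=c(4, 10, 11, 12, 12, 8, 6, 10))
vals <- c('A', 'C')

#== recycles vals as A, C, A, C, ... and compares position by position
wrong <- df[df$team == vals, ]

#%in% checks each team value against every element of vals
right <- df[df$team %in% vals, ]

#compare how many rows each approach returns
nrow(wrong)
nrow(right)
```

With this data, == returns only two rows (rows 1 and 6), while %in% returns all four rows for teams ‘A’ and ‘C’.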

Method 2: Subset Data Frame by List of Values in dplyr

The following code shows how to subset the data frame to only contain rows that have a value of ‘A’ or ‘C’ in the team column by using the filter() function from the dplyr package:

library(dplyr)

#define values to subset by
vals <- c('A', 'C')

#subset data frame to only contain rows where team is 'A' or 'C'
df_new <- filter(df, team %in% vals)

#view results
df_new

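  team points assists
1    A     12       4
2    C     20      12
3    C     28       8
4    C     30       6

The resulting data frame only contains rows that have a value of ‘A’ or ‘C’ in the team column. Unlike the base R approach, filter() does not keep the original row numbers, so the rows are renumbered from 1 to 4.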
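
The same pattern can be inverted to drop the listed values instead of keeping them. The following is a small sketch, assuming the dplyr package is installed; the name df_other is only for illustration:

```r
library(dplyr)

#recreate the example data frame and the values to exclude
df <- data.frame(team=c('A', 'B', 'B', 'B', 'C', 'C', 'C', 'D'),
                 points=c(12, 22, 35, 34, 20, 28, 30, 18),
                 assists=c(4, 10, 11, 12, 12, 8, 6, 10))
vals <- c('A', 'C')

#keep only rows where team is NOT 'A' or 'C'
df_other <- filter(df, !(team %in% vals))

#view results
df_other
```

Here df_other should contain the three ‘B’ rows and the single ‘D’ row.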

Method 3: Subset Data Frame by List of Values in data.table

The following code shows how to subset the data frame to only contain rows that have a value of ‘A’ or ‘C’ in the team column by using functions from the data.table package:

library(data.table)

#define values to subset by
vals <- c('A', 'C')

#subset data frame to only contain rows where team is 'A' or 'C'
df_new <- setDT(df, key='team')[J(vals)]

#view results
df_new

   team points assists
1:    A     12       4
2:    C     20      12
3:    C     28       8
4:    C     30       6

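The result only contains rows that have a value of ‘A’ or ‘C’ in the team column. Note that setDT() converts df to a data.table by reference and sorts it by the key column, so both df and df_new are now data.table objects rather than plain data frames.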
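
One detail to watch with the keyed lookup: by default, a value in vals that does not appear in the key column comes back as a row of NA values instead of being dropped. The following sketch shows this along with two ways to avoid it. It assumes a reasonably recent version of data.table that accepts nomatch=NULL, and the names dt, with_na, no_na and by_scan are only for illustration:

```r
library(data.table)

#recreate the example data frame
df <- data.frame(team=c('A', 'B', 'B', 'B', 'C', 'C', 'C', 'D'),
                 points=c(12, 22, 35, 34, 20, 28, 30, 18),
                 assists=c(4, 10, 11, 12, 12, 8, 6, 10))

#'E' does not appear in the team column
vals <- c('A', 'E')

#as.data.table() makes a copy, so df itself is left unchanged
dt <- as.data.table(df)
setkey(dt, team)

#default keyed lookup: the unmatched 'E' is returned as a row of NA values
with_na <- dt[J(vals)]

#nomatch=NULL drops values that have no match
no_na <- dt[J(vals), nomatch=NULL]

#filtering with %in% also returns only the matching rows
by_scan <- dt[team %in% vals]

#view results
with_na
no_na
```

In this sketch with_na has two rows (one of them all NA apart from team), while no_na and by_scan each have only the single ‘A’ row.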

Related: How to Use %in% Operator in R (With Examples)

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Subset Data Frame by Factor Levels in R
How to Subset by a Date Range in R
How to Plot Subset of a Data Frame in R
