R: Replace Values in Data Frame Based on Lookup Table


Often you may want to replace values in a data frame in R based on corresponding values in a lookup table.

The easiest way to do this is by using the match and unlist functions in R with the following syntax:

#define empty data frame
new <- df

#fill data frame based on matching values from lookup table
new[] <- lookup_df$team[match(unlist(df), lookup_df$abbrev)]

Here is what this particular example does:

  • df is an existing data frame
  • lookup_df is a lookup table
  • We look up the values from the existing data frame within the abbrev column of the lookup table and return the corresponding value from the team column of the lookup table

By using this approach, we are able to replace all values in the original data frame with the correct values from the lookup table instead.

The following example shows how to use this syntax in practice.

Example: How to Replace Values in Data Frame Based on Lookup Table in R

Suppose we create the following data frame in R named df that contains information about the first, second and third team that various basketball analysts predict will win the championship this year:

#create data frame
df <- data.frame(first=c('M', 'M', 'R', 'S', 'R', 'R', 'H', 'N'),
                 second=c('R', 'S', 'M', 'R', 'S', 'M', 'N', 'H'),
                 third=c('H', 'N', 'S', 'M', 'M', 'H', 'R', 'S'))

#view data frame
df

  first second third
1     M      R     H
2     M      S     N
3     R      M     S
4     S      R     M
5     R      S     M
6     R      M     H
7     H      N     R
8     N      H     S

Currently this data frame only contains abbreviated team names.

Suppose that we would like to replace each value in the data frame with the full team name that corresponds to the team name.

Before we do so, we can create the following lookup table named lookup_df that contains the abbreviated and full team names for each team:

#create lookup table
lookup_df <- data.frame(abbrev=c('M', 'R', 'S', 'H', 'N'),
                        team=c('Mavs', 'Rockets', 'Spurs', 'Heat', 'Nets'))

#view lookup table
lookup_df

  abbrev    team
1      M    Mavs
2      R Rockets
3      S   Spurs
4      H    Heat
5      N    Nets

We can use the following syntax to replace all of the values in the original data frame with the full team names from the lookup table:

#define empty data frame
new <- df

#fill data frame based on matching values from lookup table
new[] <- lookup_df$team[match(unlist(df), lookup_df$abbrev)]

#view data frame
new

    first  second   third
1    Mavs Rockets    Heat
2    Mavs   Spurs    Nets
3 Rockets    Mavs   Spurs
4   Spurs Rockets    Mavs
5 Rockets   Spurs    Mavs
6 Rockets    Mavs    Heat
7    Heat    Nets Rockets
8    Nets    Heat   Spurs

We can see that each value from the original data frame has been replaced with the full team name from the lookup table.

For example:

  • Each occurrence of ‘M’ has been replaced with Mavs.
  • Each occurrence of ‘R’ has been replaced with Rockets.
  • Each occurrence of ‘s’ has been replaced with Spurs.

And so on.

Additional Resources

The following tutorials explain how to perform other common operations in R:

R: How to Merge Data Frames Based on Multiple Columns
R: How to Add Column to Data Frame Based on Other Columns
R: How to Replace Missing Values

Featured Posts

Leave a Reply

Your email address will not be published. Required fields are marked *