How to Do a Left Join in R (With Examples)


You can use the merge() function to perform a left join in base R:

#left join using base R
merge(df1, df2, all.x=TRUE)

You can also use the left_join() function from the dplyr package to perform a left join:

#left join using dplyr
dplyr::left_join(df1, df2)

Note: If you’re working with extremely large datasets, the left_join() function tends to be faster than the merge() function.
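The short forms above omit the join column. When no by argument is given, merge() joins on the columns that share a name in both data frames. The sketch below recreates the two data frames used in this article so that it runs on its own. The names shared_cols, implicit, and explicit are illustrative only.

```r
#recreate the two data frames used throughout this article
df1 <- data.frame(team=c('Mavs', 'Hawks', 'Spurs', 'Nets'),
                  points=c(99, 93, 96, 104))
df2 <- data.frame(team=c('Mavs', 'Hawks', 'Spurs', 'Nets'),
                  rebounds=c(25, 32, 38, 30),
                  assists=c(19, 18, 22, 25))

#columns that appear in both data frames (the default join columns for merge)
shared_cols <- intersect(names(df1), names(df2))
shared_cols

#left join without and with an explicit join column
implicit <- merge(df1, df2, all.x=TRUE)
explicit <- merge(df1, df2, by='team', all.x=TRUE)

#check whether both calls return the same data frame
identical(implicit, explicit)
```

Naming the join column explicitly, as the examples in this article do, guards against joining on an unintended column that happens to share a name.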

The examples below show how to use each of these functions in practice with the following two data frames:

#define first data frame
df1 <- data.frame(team=c('Mavs', 'Hawks', 'Spurs', 'Nets'),
                  points=c(99, 93, 96, 104))

df1

   team points
1  Mavs     99
2 Hawks     93
3 Spurs     96
4  Nets    104

#define second data frame
df2 <- data.frame(team=c('Mavs', 'Hawks', 'Spurs', 'Nets'),
                  rebounds=c(25, 32, 38, 30),
                  assists=c(19, 18, 22, 25))

df2

   team rebounds assists
1  Mavs       25      19
2 Hawks       32      18
3 Spurs       38      22
4  Nets       30      25

Example 1: Left Join Using Base R

We can use the merge() function in base R to perform a left join, using the ‘team’ column as the column to join on:

#perform left join using base R
merge(df1, df2, by='team', all.x=TRUE)

   team points rebounds assists
1 Hawks     93       32      18
2  Mavs     99       25      19
3  Nets    104       30      25
4 Spurs     96       38      22
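Since every team appears in both data frames above, the result does not show what a left join does with unmatched rows. The following sketch uses a hypothetical data frame, df3 (not part of the original example), that has no row for 'Nets' and one extra row for 'Kings'. Every row of the left data frame is kept, missing values are filled with NA, and rows found only in the right data frame are dropped.

```r
#first data frame from this article
df1 <- data.frame(team=c('Mavs', 'Hawks', 'Spurs', 'Nets'),
                  points=c(99, 93, 96, 104))

#hypothetical second data frame: no 'Nets' row, extra 'Kings' row
df3 <- data.frame(team=c('Mavs', 'Hawks', 'Spurs', 'Kings'),
                  rebounds=c(25, 32, 38, 41))

#perform left join using base R
result <- merge(df1, df3, by='team', all.x=TRUE)
result

#    team points rebounds
# 1 Hawks     93       32
# 2  Mavs     99       25
# 3  Nets    104       NA
# 4 Spurs     96       38
```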

Example 2: Left Join Using dplyr

We can use the left_join() function from the dplyr package to perform a left join, using the ‘team’ column as the column to join on:

library(dplyr)

#perform left join using dplyr 
left_join(df1, df2, by='team')

   team points rebounds assists
1  Mavs     99       25      19
2 Hawks     93       32      18
3 Spurs     96       38      22
4  Nets    104       30      25

One difference you’ll notice between these two functions is that the merge() function, by default (sort=TRUE), sorts the rows by the column you used to perform the join, which means alphabetical order for a character column such as ‘team’.

Conversely, the left_join() function preserves the original order of the rows from the first data frame.
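If you need the original row order but want to stay in base R, one possible workaround is to record each row's position before the merge and sort by it afterwards. This is only a sketch. The helper column row_id is a name made up for this example.

```r
#data frames from this article
df1 <- data.frame(team=c('Mavs', 'Hawks', 'Spurs', 'Nets'),
                  points=c(99, 93, 96, 104))
df2 <- data.frame(team=c('Mavs', 'Hawks', 'Spurs', 'Nets'),
                  rebounds=c(25, 32, 38, 30),
                  assists=c(19, 18, 22, 25))

#record the original row positions in a helper column
df1$row_id <- seq_len(nrow(df1))

#perform left join using base R
merged <- merge(df1, df2, by='team', all.x=TRUE)

#restore the original order, then drop the helper column and reset row names
merged <- merged[order(merged$row_id), ]
merged$row_id <- NULL
rownames(merged) <- NULL

merged
```

The rows of merged now follow the order of the first data frame (Mavs, Hawks, Spurs, Nets), the same order that left_join() returns.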

Additional Resources

How to Add a Column to Data Frame in R
How to Drop Columns from Data Frame in R
How to Select Specific Columns in R
