How to Combine Columns Using dplyr


Often you may want to use functions from the dplyr package in R to combine multiple columns in a data frame into a single column.

You can use the following basic syntax to do so:

library(dplyr)

#add new column that combines values from team and pos columns
df <- df %>% mutate(info=paste(team, pos, sep = "_"))

This particular example creates a new column named info that combines the values from the team and pos columns of the data frame, using an underscore as a separator.

Note that the mutate() function from the dplyr package creates the new column, while the paste() function from base R concatenates the values from each row. The sep argument specifies the separator placed between the concatenated values.
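To see what paste() does on its own, outside of mutate(), the following minimal sketch uses two small hypothetical vectors (they are for illustration only and are not the data frame used later in this tutorial). It suggests that paste() works element by element, and that the related base R function paste0() behaves the same way but with no separator:

```r
#hypothetical vectors used only for illustration
team <- c('A', 'B')
pos <- c('G', 'F')

#paste() joins the vectors element by element, using sep between values
combined <- paste(team, pos, sep = "_")
combined
#[1] "A_G" "B_F"

#paste0() joins the same values with no separator
no_sep <- paste0(team, pos)
no_sep
#[1] "AG" "BF"
```

Because paste() is vectorized in this way, it can be used directly inside mutate() to build a new column from existing ones.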

The following example shows how to use this syntax in practice.

Note: You may need to first use the following syntax to install the dplyr package:

install.packages('dplyr')

Once the dplyr package is installed, you can then use the various functions from it to combine multiple columns into one column.

Example: How to Combine Columns Using dplyr

Suppose we create the following data frame in R that contains information about various basketball players:

#create data frame
df <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                 pos=c('G', 'F', 'F', 'G', 'G', 'F'),
                 points=c(12, 22, 35, 10, 19, 40))

#view data frame
df

  team pos points
1    A   G     12
2    A   F     22
3    A   F     35
4    B   G     10
5    B   G     19
6    B   F     40

This data frame contains the following three columns:

  • team: The team the player belongs to
  • pos: The position of the player (G = Guard, F = Forward)
  • points: The total points scored by the player

Suppose that we would like to add a new column to the data frame named info that combines the values from the team and pos columns into one column.

We can use the following syntax to do so:

library(dplyr)

#add new column that combines values from team and pos columns
df <- df %>% mutate(info=paste(team, pos, sep = "_"))

#view updated data frame
df

  team pos points info
1    A   G     12  A_G
2    A   F     22  A_F
3    A   F     35  A_F
4    B   G     10  B_G
5    B   G     19  B_G
6    B   F     40  B_F

Notice that a new column named info has been added that combines the values from the team and pos columns, using an underscore as a separator.
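The original team and pos columns remain in the data frame after mutate() runs. If only the combined column is needed, one possible follow-up step is to drop the originals with the select() function from dplyr. This is a sketch under the assumption that the separate columns are no longer wanted, and the name df_combined is hypothetical:

```r
library(dplyr)

#recreate the data frame from this example
df <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                 pos=c('G', 'F', 'F', 'G', 'G', 'F'),
                 points=c(12, 22, 35, 10, 19, 40))

#combine team and pos, then drop the two original columns
#(df_combined is a hypothetical name; df itself is left unchanged)
df_combined <- df %>%
  mutate(info=paste(team, pos, sep = "_")) %>%
  select(-team, -pos)

#view result, which keeps only the points and info columns
df_combined
```

Storing the result under a separate name leaves the original data frame available in case the individual columns are needed later.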

Note that you can pass any string to the sep argument of the paste() function to use a different separator.

For example, we could use the following syntax to instead use a colon as a separator:

library(dplyr)

#combine values from team and pos columns, using a colon as the separator
df <- df %>% mutate(info=paste(team, pos, sep = ":"))

#view updated data frame
df

  team pos points info
1    A   G     12  A:G
2    A   F     22  A:F
3    A   F     35  A:F
4    B   G     10  B:G
5    B   G     19  B:G
6    B   F     40  B:F

We can see that the info column now combines the values from the team and pos columns, using a colon as a separator.
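The paste() function is not limited to two columns, and it converts numeric values to text before joining them. As a minimal sketch, assuming we also wanted the points value included in the combined string (the column name info_all is hypothetical and not part of the example above):

```r
library(dplyr)

#recreate the data frame from this example
df <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                 pos=c('G', 'F', 'F', 'G', 'G', 'F'),
                 points=c(12, 22, 35, 10, 19, 40))

#combine team, pos and points into one column (info_all is a hypothetical name)
df <- df %>% mutate(info_all=paste(team, pos, points, sep = "_"))

#view the combined values
df$info_all
#[1] "A_G_12" "A_F_22" "A_F_35" "B_G_10" "B_G_19" "B_F_40"
```

The resulting column is a character column, even though one of the inputs was numeric.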

Note: You can find the complete documentation for the mutate() function in the official dplyr package reference.

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Use slice_max() in dplyr
How to Rename Columns Using dplyr
How to Add Row to Data Frame Using dplyr
How to Use the pull() Function in dplyr
