How to Use str_split in R (With Examples)


The str_split() function from the stringr package in R splits each string in a character vector into pieces and returns the result as a list. This function uses the following syntax:

str_split(string, pattern)

where:

  • string: Character vector
  • pattern: Pattern to split on
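
One detail worth keeping in mind: stringr interprets the pattern as a regular expression by default, so characters such as a period have a special meaning. The following is a minimal sketch, using a made-up version string that is not part of this tutorial's data, of wrapping the pattern in fixed() to split on a literal period:

```r
library(stringr)

#hypothetical input, not part of the tutorial's data frame
version <- "1.4.2"

#fixed() makes stringr match the period literally instead of as a regex wildcard
parts <- str_split(version, fixed("."))

#str_split() returns a list, even for a single input string
parts[[1]]
```

Escaping the period in a regular expression would be an alternative; fixed() is used here only because it keeps the pattern readable.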

Similarly, the str_split_fixed() function from the stringr package can be used to split a string into a fixed number of pieces. This function uses the following syntax:

str_split_fixed(string, pattern, n)

where:

  • string: Character vector
  • pattern: Pattern to split on
  • n: Number of pieces to return
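
To illustrate how n behaves when a string contains fewer or more pieces than requested, here is a small sketch on a hypothetical vector (the strings below are invented for illustration):

```r
library(stringr)

#hypothetical strings containing one, two and three names
x <- c("andy", "andy & bob", "andy & bob & carl")

#ask for exactly two pieces per string
m <- str_split_fixed(x, " & ", 2)
m
```

In this sketch, the string with only one name gets an empty string in the second column, and the string with three names is split only at the first separator, so the remainder "bob & carl" stays together in the last piece.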

This tutorial provides examples of how to use each of these functions on the following data frame:

#create data frame
df <- data.frame(team=c('andy & bob', 'carl & doug', 'eric & frank'),
                 points=c(14, 17, 19))

#view data frame
df

          team points
1   andy & bob     14
2  carl & doug     17
3 eric & frank     19

Example 1: Split String Using str_split()

The following code shows how to split the string in the “team” column using the str_split() function:

library(stringr)

#split the string in the team column on " & "
str_split(df$team, " & ")

[[1]]
[1] "andy" "bob" 

[[2]]
[1] "carl" "doug"

[[3]]
[1] "eric"  "frank"

The result is a list of three elements that show the individual player names on each team.
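
Because the result is a list, individual names have to be pulled out of each element. One possible way to do this, sketched here with base R's vapply() and a variable name (first_player) chosen only for illustration:

```r
library(stringr)

#same data frame as in the tutorial
df <- data.frame(team=c('andy & bob', 'carl & doug', 'eric & frank'),
                 points=c(14, 17, 19))

#split each team into a character vector of player names
pieces <- str_split(df$team, " & ")

#pull the first name out of each list element
first_player <- vapply(pieces, function(p) p[1], character(1))
first_player
```

Using vapply() with character(1) is a design choice rather than a requirement: it guarantees that exactly one string comes back per team, which sapply() does not check.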

Example 2: Split String Using str_split_fixed()

The following code shows how to split the string in the “team” column into two fixed pieces using the str_split_fixed() function:

library(stringr)

#split the string in the team column into two pieces on " & "
str_split_fixed(df$team, " & ", 2)

     [,1]   [,2]   
[1,] "andy" "bob"  
[2,] "carl" "doug" 
[3,] "eric" "frank"

The result is a matrix with two columns and three rows.
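
A matrix can also be requested from str_split() itself through its simplify argument. The sketch below, which uses a standalone vector (teams) holding the same three strings, compares the two approaches on this data:

```r
library(stringr)

#same team strings as in the tutorial, stored in a plain vector
teams <- c('andy & bob', 'carl & doug', 'eric & frank')

#simplify = TRUE asks str_split() for a character matrix instead of a list
m_simplified <- str_split(teams, " & ", simplify = TRUE)
m_fixed <- str_split_fixed(teams, " & ", 2)

#compare the two results cell by cell
all(m_simplified == m_fixed)
```

The two calls agree here because every string splits into exactly two pieces. With simplify = TRUE and the default n, the number of columns depends on the longest split in the data, whereas str_split_fixed() always returns n columns.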

One useful application of the str_split_fixed() function is to append the resulting matrix to the end of the data frame. For example:

library(stringr)

#split the string in the team column and append resulting matrix to data frame
df[ , 3:4] <- str_split_fixed(df$team, " & ", 2)

#view data frame
df

          team points   V3    V4
1   andy & bob     14 andy   bob
2  carl & doug     17 carl  doug
3 eric & frank     19 eric frank

The column titled ‘V3’ shows the name of the first player on the team and the column titled ‘V4’ shows the name of the second player on the team.
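
If more descriptive column names are preferred over the default ones, one option is to store the matrix first and assign each of its columns by name. This is a sketch only; the names players, player1 and player2 are arbitrary choices made for illustration:

```r
library(stringr)

#same data frame as in the tutorial
df <- data.frame(team=c('andy & bob', 'carl & doug', 'eric & frank'),
                 points=c(14, 17, 19))

#split once, then assign each matrix column to a named data frame column
players <- str_split_fixed(df$team, " & ", 2)
df$player1 <- players[, 1]
df$player2 <- players[, 2]

#view data frame
df
```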

Additional Resources

How to Use str_replace in R
How to Perform Partial String Matching in R
How to Convert Strings to Dates in R
How to Convert Character to Numeric in R
