How to Use gsub() in R to Replace Multiple Patterns


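The gsub() function in R replaces all occurrences of a pattern within a string. By default the pattern is interpreted as a regular expression, so characters such as . ( ) and ? have special meanings and must be escaped (or used with fixed=TRUE) to match literally.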
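
Because gsub() treats the pattern as a regular expression unless told otherwise, replacing punctuation can give surprising results. The short sketch below uses a made-up string x (not part of the examples in this tutorial) to contrast an unescaped pattern, an escaped pattern, and fixed=TRUE:

```r
#hypothetical example string
x <- 'a.b.c'

#unescaped '.' matches ANY character, so every character is removed
res_regex <- gsub('.', '', x)

#escape the dot with a double backslash to match it literally
res_escaped <- gsub('\\.', '', x)

#or turn off regular expressions entirely with fixed=TRUE
res_fixed <- gsub('.', '', x, fixed=TRUE)

res_regex
#[1] ""

res_escaped
#[1] "abc"

res_fixed
#[1] "abc"
```

If the text you want to replace contains punctuation and you don't need regular expression features, fixed=TRUE is usually the simpler choice.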

To replace multiple patterns at once, you can use a nested gsub() statement:

df$col1 <- gsub('old1', 'new1',
           gsub('old2', 'new2',
           gsub('old3', 'new3', df$col1)))

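However, a method that is generally faster on large vectors is the stri_replace_all_regex() function from the stringi package. Setting vectorize_all=FALSE tells the function to apply every pattern, in order, to every element of the column. It uses the following syntax:

library(stringi)

df$col1 <- stri_replace_all_regex(df$col1,
                                  pattern=c('old1', 'old2', 'old3'),
                                  replacement=c('new1', 'new2', 'new3'),
                                  vectorize_all=FALSE)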

The following examples show how to use each method in practice.

Method 1: Replace Multiple Patterns with Nested gsub()

Suppose we have the following data frame in R:

#create data frame
df <- data.frame(name=c('A', 'B', 'B', 'C', 'D', 'D'),
                 points=c(24, 26, 28, 14, 19, 12))

#view data frame
df

  name points
1    A     24
2    B     26
3    B     28
4    C     14
5    D     19
6    D     12 

We can use a nested gsub() statement to replace multiple patterns in the name column. The innermost call runs first, so C is replaced first, then B, then A:

#replace multiple patterns in name column
df$name <- gsub('A', 'Andy',
           gsub('B', 'Bob',
           gsub('C', 'Chad', df$name)))

#view updated data frame
df

  name points
1 Andy     24
2  Bob     26
3  Bob     28
4 Chad     14
5    D     19
6    D     12

Notice that A, B, and C in the name column have all been replaced with new values, while D is unchanged because none of the patterns matched it.
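
When the list of patterns grows, nested calls become hard to read. One possible alternative is sketched below. It assumes a hypothetical helper named replace_all() and a named vector called lookup (neither is built into R) and simply calls gsub() once per pattern in a loop. Note that the loop applies the patterns in the order they are listed (A, B, C), whereas the nested statement runs the innermost gsub() first.

```r
#create data frame
df <- data.frame(name=c('A', 'B', 'B', 'C', 'D', 'D'),
                 points=c(24, 26, 28, 14, 19, 12))

#hypothetical lookup table: names are patterns, values are replacements
lookup <- c(A='Andy', B='Bob', C='Chad')

#hypothetical helper: apply gsub() once for each pattern in the lookup table
replace_all <- function(x, lookup) {
  for (i in seq_along(lookup)) {
    x <- gsub(names(lookup)[i], lookup[[i]], x)
  }
  x
}

#replace multiple patterns in name column
df$name <- replace_all(df$name, lookup)

#view updated names
df$name
#[1] "Andy" "Bob"  "Bob"  "Chad" "D"    "D"
```

Adding a new pattern then only requires adding one entry to lookup rather than another layer of nesting.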

Method 2: Replace Multiple Patterns with stringi

A generally faster way to replace multiple patterns, especially on large vectors, is the stri_replace_all_regex() function from the stringi package.

The following code shows how to use this function:

library(stringi)

#replace multiple patterns in name column
df$name <- stri_replace_all_regex(df$name,
                                  pattern=c('A', 'B', 'C'),
                                  replacement=c('Andy', 'Bob', 'Chad'),
                                  vectorize_all=FALSE)

#view updated data frame
df

  name points
1 Andy     24
2  Bob     26
3  Bob     28
4 Chad     14
5    D     19
6    D     12

Notice that the resulting data frame matches the one from the previous example.
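
The last argument in the call above matters more than it may appear. The sketch below, which uses a standalone vector named name rather than the data frame, compares the default behavior of stri_replace_all_regex() with the behavior when every pattern is applied to every element. By default, pattern i is paired with element i (recycling the patterns as needed), so some values are never checked against the pattern that would match them.

```r
library(stringi)

#standalone copy of the name column
name <- c('A', 'B', 'B', 'C', 'D', 'D')

#default (vectorize_all=TRUE): patterns are paired with elements and recycled
paired <- stri_replace_all_regex(name,
                                 pattern=c('A', 'B', 'C'),
                                 replacement=c('Andy', 'Bob', 'Chad'))

#vectorize_all=FALSE: each pattern is applied, in order, to every element
sequential <- stri_replace_all_regex(name,
                                     pattern=c('A', 'B', 'C'),
                                     replacement=c('Andy', 'Bob', 'Chad'),
                                     vectorize_all=FALSE)

paired
#[1] "Andy" "Bob"  "B"    "C"    "D"    "D"

sequential
#[1] "Andy" "Bob"  "Bob"  "Chad" "D"    "D"
```

In the paired result, the third element B was compared only against the pattern C and the fourth element C only against the pattern A, so neither was replaced.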

If your data frame is even moderately large, you’ll likely find that this function runs faster than nested gsub() calls, although the exact difference depends on your data, your patterns, and your machine.
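
To check the speed difference on your own machine, you can time both approaches with system.time(). The following is only a rough sketch: the test vector x is made-up data, and the timings it reports will vary by machine, R version, and stringi version.

```r
library(stringi)

#hypothetical test data: 100,000 values drawn from A, B, C, D
set.seed(1)
x <- sample(c('A', 'B', 'C', 'D'), 1e5, replace=TRUE)

#time the nested gsub() approach
time_gsub <- system.time(
  res_gsub <- gsub('A', 'Andy',
              gsub('B', 'Bob',
              gsub('C', 'Chad', x)))
)

#time the stringi approach
time_stri <- system.time(
  res_stri <- stri_replace_all_regex(x,
                                     pattern=c('A', 'B', 'C'),
                                     replacement=c('Andy', 'Bob', 'Chad'),
                                     vectorize_all=FALSE)
)

#confirm that both approaches return the same values
identical(res_gsub, res_stri)

#compare elapsed time in seconds
time_gsub['elapsed']
time_stri['elapsed']
```

For more reliable numbers, repeat each timing several times or increase the size of x.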

Additional Resources

The following tutorials explain how to perform other common operations in R:

How to Use the replace() Function in R
How to Replace Values in R Data Frame Conditionally


One Reply to “How to Use gsub() in R to Replace Multiple Patterns”

  1. The article forgets to mention that this function looks for regular expressions, not simple strings of text. The RDocumentation file says as much, and the author of this post should not assume that the user knows to use regular expressions within the function, even if the function itself makes reference to regular expressions. I attempted to use it to update a series of punctuation characters, with disastrous transformations. Please update the article accordingly, and in doing so, assume the reader has no idea how regular expressions work, i.e., depending on what is being replaced, the user may need to use escape characters. Thank you.
