The str_match() function from the stringr package in R can be used to extract the first match of a pattern, along with any capture groups, from each string in a character vector.
This function uses the following syntax:
str_match(string, pattern)
where:
- string: Character vector to search
- pattern: Pattern (regular expression) to look for
The following examples show how to use this function in practice.
Example 1: Use str_match with Vector
The following code shows how to use the str_match() function to extract matched patterns from a character vector:
library(stringr)
#create vector of strings
x <- c('Mavs', 'Cavs', 'Heat', 'Thunder', 'Blazers')
#extract the pattern 'avs' from each string
str_match(x, pattern='avs')
     [,1]
[1,] "avs"
[2,] "avs"
[3,] NA
[4,] NA
[5,] NA
The result is a matrix in which each row displays the matched pattern or an NA value if the pattern was not found.
For example:
- The pattern ‘avs’ was found in the first element ‘Mavs’, so ‘avs’ was returned.
- The pattern ‘avs’ was found in the second element ‘Cavs’, so ‘avs’ was returned.
- The pattern ‘avs’ was not found in the third element ‘Heat’, so NA was returned.
And so on.
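Because str_match() returns a matrix, it can also hold capture groups: the first column contains the full match and each additional column contains one parenthesized group. The following is a minimal sketch of this behavior; the input strings and the variable names m, teams, and scores are made up for illustration and are not part of the examples above.

```r
library(stringr)

#hypothetical vector: team name followed by a score
x <- c('Mavs 99', 'Cavs 104', 'Heat')

#pattern with two capture groups: letters (team) and digits (score)
m <- str_match(x, '([A-Za-z]+) (\\d+)')

#column 1 is the full match, columns 2 and 3 are the capture groups
m

#pull each group out as its own vector
teams <- m[, 2]
scores <- as.numeric(m[, 3])
```

The third string has no score, so the pattern does not match and every column in that row is NA.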
Example 2: Use str_match with Data Frame
Suppose we have the following data frame in R:
#create data frame
df <- data.frame(team=c('Mavs', 'Cavs', 'Heat', 'Thunder', 'Blazers'),
                 points=c(99, 104, 110, 103, 115))
#view data frame
df
     team points
1    Mavs     99
2    Cavs    104
3    Heat    110
4 Thunder    103
5 Blazers    115
The following code shows how to use the str_match() function to add a new column to the data frame that contains the matched pattern for each team name, or NA if the pattern is not found:
library(stringr)
#create new column
df$match <- str_match(df$team, pattern='avs')
#view updated data frame
df
     team points match
1    Mavs     99   avs
2    Cavs    104   avs
3    Heat    110  <NA>
4 Thunder    103  <NA>
5 Blazers    115  <NA>
The new column titled match contains either the pattern ‘avs’ or NA, depending on whether the pattern is found in the team column.
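One detail worth knowing: str_match() returns a matrix, so assigning its result directly stores a one-column matrix inside the data frame. If a plain character column is preferred, one option is to keep only the first column of the result. The sketch below assumes the same data frame as above; the has_avs column is a hypothetical addition used only to illustrate filtering.

```r
library(stringr)

#recreate the data frame from the example
df <- data.frame(team=c('Mavs', 'Cavs', 'Heat', 'Thunder', 'Blazers'),
                 points=c(99, 104, 110, 103, 115))

#keep only the first column so 'match' is a plain character vector
df$match <- str_match(df$team, pattern='avs')[, 1]

#hypothetical helper column: TRUE where the pattern was found
df$has_avs <- !is.na(df$match)

#keep only the rows where the pattern was found
df[df$has_avs, ]
```

Storing a plain vector tends to behave more predictably with functions that expect ordinary data frame columns.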