R: How to Use Case-Insensitive grep()


You can use the grep() function in R to find elements in a vector that match a particular pattern.

By default, the grep() function is case-sensitive: it only matches elements in which both the characters and their case agree with the pattern.

However, you may often want to use the grep() function for case-insensitive matching.

To do so, you can use the ignore.case=TRUE argument. This tells grep() to ignore the case of the characters when searching for matching patterns.
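As a quick illustration on a plain character vector, the sketch below compares the default behavior with ignore.case=TRUE. The vector teams is a made-up example used only here, not part of the data used later. Note that grep() returns the positions of the matching elements rather than the elements themselves:

```r
#hypothetical vector used only for illustration
teams <- c('Mavs', 'Hawks', 'MAVS', 'mavs')

#default: case-sensitive, returns positions of matching elements
default_idx <- grep('Mavs', teams)

#ignore.case=TRUE: matches regardless of case
nocase_idx <- grep('Mavs', teams, ignore.case=TRUE)

#view positions returned by each call
default_idx
nocase_idx
```

Because grep() returns positions, its result can be used directly as a row index for a data frame.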

You can use the following basic syntax to apply this argument in practice:

#create new data frame that contains rows that match Mavs in team column
df_new <- df[grep('Mavs', df$team, ignore.case=TRUE), ]

This example creates a new data frame containing every row of the original data frame whose team column contains the string Mavs, regardless of case.

The following example shows how to use this syntax in practice.

Related: R: How to Use grep() to Find Exact Match

Example: How to Use Case-Insensitive grep() in R

Suppose we create the following data frame in R that contains information about various basketball teams:

#create data frame
df <- data.frame(team=c('Mavs', 'Hawks', 'Nets', 'Heat', 'mavs', 'MAVS', 'Kings'),
                 points=c(104, 115, 124, 120, 112, 140, 112),
                 status=c('Bad', 'Good', 'Excellent', 'Great', 'Bad', 'Great', 'Bad'))

#view data frame
df

   team points    status
1  Mavs    104       Bad
2 Hawks    115      Good
3  Nets    124 Excellent
4  Heat    120     Great
5  mavs    112       Bad
6  MAVS    140     Great
7 Kings    112       Bad

Suppose that we would like to use the grep() function to extract each row from the data frame that contains the string Mavs in the team column.

First, we can try the following syntax, which relies on the default case-sensitive matching:

#create new data frame that contains rows that match Mavs in team column
df_new <- df[grep('Mavs', df$team), ]

#view new data frame
df_new

  team points status
1 Mavs    104    Bad

Notice that the new data frame only contains the one row whose team column contains Mavs with exactly this capitalization.

By default, the grep() function uses case-sensitive matching, so it does not match the rows that contain MAVS or mavs in the team column.

To perform a case-insensitive match with the grep() function, we can specify ignore.case=TRUE as follows:

#create new data frame that contains rows that match Mavs in team column, regardless of case
df_new <- df[grep('Mavs', df$team, ignore.case=TRUE), ]

#view new data frame
df_new

  team points status
1 Mavs    104    Bad
5 mavs    112    Bad
6 MAVS    140  Great

Notice that the new data frame contains all rows that match the pattern Mavs in the team column of the original data frame, regardless of whether or not the case matches.

Specifically, we can see that the new data frame contains the rows with the following values in the team column:

  • Mavs
  • mavs
  • MAVS

Each of these values matches the pattern Mavs, regardless of the case.
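To inspect the matches themselves rather than their row positions, one option is the value=TRUE argument of grep(), or the related grepl() function, both of which also accept ignore.case=TRUE. The following is a minimal sketch that reuses the team values from the example above; the names matched_values and matched_flags are chosen here purely for illustration:

```r
#same team values as in the example above
team <- c('Mavs', 'Hawks', 'Nets', 'Heat', 'mavs', 'MAVS', 'Kings')

#return the matching values instead of their positions
matched_values <- grep('Mavs', team, ignore.case=TRUE, value=TRUE)

#return a logical vector with one TRUE/FALSE per element
matched_flags <- grepl('Mavs', team, ignore.case=TRUE)

#view results
matched_values
matched_flags
```

The logical vector from grepl() can also be used for row filtering, for example df[grepl('Mavs', df$team, ignore.case=TRUE), ], which should select the same rows as the grep() version shown earlier.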

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Concatenate Vector of Strings in R
How to Extract Numbers from Strings in R
How to Remove Spaces from Strings in R
How to Compare Strings in R
