R: How to Count Occurrences of Character in String


Often you may want to count the number of occurrences of a character in a string in R.

There are two common ways to do so:

Method 1: Count Occurrences of Character in String Using Base R

df$A_count <- lengths(regmatches(df$emp, gregexpr('A', df$emp)))

This example creates a new column named A_count in the data frame that counts the occurrences of the character ‘A’ in each value of the emp column.
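To see why this one-liner works, it can help to run the three nested functions one at a time. The following is a minimal sketch on a single string; the variable names x, m, and hits are hypothetical and used only for illustration:

```r
# Hypothetical single string, used only to trace the steps
x <- 'AAA02'

# Step 1: gregexpr() finds the start position of every match of 'A'
m <- gregexpr('A', x)
m[[1]]            # positions 1, 2, 3 (it returns -1 when there is no match)

# Step 2: regmatches() extracts the matched substrings as a list
hits <- regmatches(x, m)
hits[[1]]         # "A" "A" "A"

# Step 3: lengths() counts the elements in each list entry
lengths(hits)     # 3
```

When a string contains no match, regmatches() returns an empty character vector for it, so lengths() reports 0.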

Method 2: Count Occurrences of Character in String Using stringr Package

library(stringr)

df$A_count <- str_count(df$emp, 'A')

This example also creates a new column named A_count in the data frame that counts the occurrences of the character ‘A’ in each value of the emp column.

This method uses the str_count() function from the stringr package, which can be faster than the base R approach if you’re working with an extremely large data frame.
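If you want to check the speed difference on your own machine, one rough way is to time both approaches with system.time(). The sketch below assumes a hypothetical vector named big that is built by repeating a few IDs; actual timings depend on your hardware, R version, and data, so no timings are shown here:

```r
library(stringr)

# Hypothetical large vector built by repeating a few IDs (100,000 strings)
big <- rep(c('AA04', 'AB08', 'BHHT3', 'AAA02'), times = 25000)

# Elapsed time for the base R approach
system.time(base_counts <- lengths(regmatches(big, gregexpr('A', big))))

# Elapsed time for the stringr approach
system.time(str_counts <- str_count(big, 'A'))

# Check whether both approaches agree on the counts
identical(base_counts, str_counts)
```

For more careful measurements, a dedicated benchmarking package would give more stable numbers than a single system.time() call.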

The following examples show how to use each method in practice with the following data frame in R:

#create data frame
df <- data.frame(emp=c('AA04', 'AB08', 'BHHT3', 'AAA02', 'AHHR1', 'BDDE2', 'BTE02'),
                 sales=c(120, 150, 300, 234, 298, 138, 199))

#view data frame
df

    emp sales
1  AA04   120
2  AB08   150
3 BHHT3   300
4 AAA02   234
5 AHHR1   298
6 BDDE2   138
7 BTE02   199

This data frame has a column named emp, which contains the ID numbers of employees at some company, and a column named sales, which contains the total sales made by each employee during a given year.

Example 1: Count Occurrences of Character in String Using Base R

We can use the following syntax to count the number of occurrences of the character ‘A’ in the emp column of the data frame by using only functions from base R:

#create new column to count occurrences of 'A' in emp column
df$A_count <- lengths(regmatches(df$emp, gregexpr('A', df$emp)))

#view updated data frame
df

    emp sales A_count
1  AA04   120       2
2  AB08   150       1
3 BHHT3   300       0
4 AAA02   234       3
5 AHHR1   298       1
6 BDDE2   138       0
7 BTE02   199       0

Notice that a new column named A_count has been created that contains the number of occurrences of the character ‘A’ in the emp column.

From the output we can see:

  • The first employee ID has 2 A characters.
  • The second employee ID has 1 A character.
  • The third employee ID has 0 A characters.

And so on.
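One thing to keep in mind is that gregexpr() treats its first argument as a regular expression by default. This makes no difference for a letter such as ‘A’, but it matters for characters with a special meaning such as a period. The sketch below uses a hypothetical vector named ids, which is not part of the data frame above, to show how the fixed = TRUE argument forces a literal match:

```r
# Hypothetical IDs containing periods, not part of the data frame above
ids <- c('A.01', 'B.0.2', 'C003')

# As a regular expression, '.' matches any character,
# so this counts every character in each string
lengths(regmatches(ids, gregexpr('.', ids)))                # 4 5 4

# With fixed = TRUE, '.' is matched literally
lengths(regmatches(ids, gregexpr('.', ids, fixed = TRUE)))  # 1 2 0
```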

Example 2: Count Occurrences of Character in String Using stringr

We can use the following syntax to count the number of occurrences of the character ‘A’ in the emp column of the data frame by using the str_count() function from the stringr package:

library(stringr)

#create new column to count occurrences of 'A' in emp column
df$A_count <- str_count(df$emp, 'A')

#view updated data frame
df

    emp sales A_count
1  AA04   120       2
2  AB08   150       1
3 BHHT3   300       0
4 AAA02   234       3
5 AHHR1   298       1
6 BDDE2   138       0
7 BTE02   199       0

This syntax creates a new column named A_count that contains the number of occurrences of the character ‘A’ in the emp column.

Notice that this returns the same results as the previous method using base R.
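Like the base R approach, str_count() interprets its pattern as a regular expression by default. The stringr package provides the fixed() and regex() helpers to change this behavior. The following sketch uses hypothetical vectors named ids and mixed, which are not part of the data frame above, to count a literal period and to count a letter regardless of case:

```r
library(stringr)

# Hypothetical IDs containing periods, not part of the data frame above
ids <- c('A.01', 'B.0.2', 'C003')

# fixed() matches the period literally instead of as a regex wildcard
str_count(ids, fixed('.'))                       # 1 2 0

# Hypothetical mixed-case IDs
mixed <- c('Aa04', 'ab08', 'BHHT3')

# regex() with ignore_case = TRUE counts both 'A' and 'a'
str_count(mixed, regex('a', ignore_case = TRUE)) # 2 1 0
```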

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Check if Character is in String in R
How to Remove Last Character from String in R
How to Find Location of Character in a String in R
How to Select Columns Containing a Specific String in R
