The nchar() function in R can be used to count the number of characters in a string.
This function uses the following basic syntax:
nchar(x, keepNA = NA)
where:
- x: A character vector, or an object that can be coerced to one
- keepNA: Controls how missing values are handled. With the default (NA), nchar() returns NA for a missing string. If set to FALSE, a value of 2 is returned to represent the length of ‘NA’ as a string.
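As a quick illustration of this syntax, the sketch below calls nchar() directly on a small character vector. The vector name x and its values are made up for this illustration only. Because nchar() is vectorized, it returns one count per element.

```r
#hypothetical character vector used only for illustration
x <- c('apple', 'kiwi', NA)

#default behavior: the missing value stays NA
nchar(x)                  #returns 5 4 NA

#keepNA=FALSE: the missing value is counted as the two characters in 'NA'
nchar(x, keepNA=FALSE)    #returns 5 4 2
```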
The following examples show how to use this function in practice.
Example 1: Use nchar() to Count Length of Characters
Suppose we have the following data frame in R:
#create data frame
df <- data.frame(player=c('J Kidd', 'Kobe Bryant', 'Paul A. Pierce', 'Steve Nash'),
                 points=c(22, 34, 30, 17))
#view data frame
df
          player points
1         J Kidd     22
2    Kobe Bryant     34
3 Paul A. Pierce     30
4     Steve Nash     17
The following code shows how to use the nchar() function to count the length of each string in the player column:
#create new column that counts length of characters in player column
df$player_length <- nchar(df$player)
#view updated data frame
df
          player points player_length
1         J Kidd     22             6
2    Kobe Bryant     34            11
3 Paul A. Pierce     30            14
4     Steve Nash     17            10
The new column called player_length contains the length of each string in the player column.
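This example assumes the player column is stored as character data, which is the default for data.frame() in R 4.0.0 and later. In earlier versions, strings were converted to factors by default, and nchar() raises an error when given a factor. The sketch below is one defensive option: it sets stringsAsFactors=TRUE explicitly, only to reproduce the older behavior, and converts the column to character before counting. The name df_old is hypothetical.

```r
#reproduce the pre-4.0.0 default by storing player as a factor
df_old <- data.frame(player=c('J Kidd', 'Kobe Bryant'),
                     stringsAsFactors=TRUE)

#convert the factor to character before counting characters
df_old$player_length <- nchar(as.character(df_old$player))

#view the counts
df_old$player_length    #returns 6 11
```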
Note that the nchar() function counts spaces and special characters as well.
For example, in the name ‘Paul A. Pierce’ the nchar() function counts the two spaces and the period along with all of the letters to get a total length of 14.
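To make the point about spaces and punctuation concrete, the sketch below compares the length of this name before and after removing everything that is not a letter. The gsub() pattern and the variable names are illustrative choices, not part of the original example.

```r
#name taken from the data frame above
name <- 'Paul A. Pierce'

#length including the two spaces and the period
nchar(name)            #returns 14

#drop every character that is not a letter (illustrative pattern)
letters_only <- gsub('[^A-Za-z]', '', name)

#length of the letters alone
nchar(letters_only)    #returns 11
```

The difference of 3 corresponds to the two spaces and the period.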
Example 2: Use nchar() with NA Values
Suppose we have the following data frame in R:
#create data frame
df <- data.frame(player=c(NA, 'Kobe Bryant', 'Paul A. Pierce', 'Steve Nash'),
                 points=c(22, 34, 30, 17))
#view data frame
df
          player points
1           <NA>     22
2    Kobe Bryant     34
3 Paul A. Pierce     30
4     Steve Nash     17
If we use the nchar() function to count the length of each string in the player column, then a value of NA will be returned for the first row by default:
#create new column that counts length of characters in player column
df$player_length <- nchar(df$player)
#view updated data frame
df
          player points player_length
1           <NA>     22            NA
2    Kobe Bryant     34            11
3 Paul A. Pierce     30            14
4     Steve Nash     17            10
However, if we use the argument keepNA=FALSE then a value of 2 will be returned for each missing value (NA) in the column:
#create new column that counts length of characters in player column
df$player_length <- nchar(df$player, keepNA=FALSE)
#view updated data frame
df
          player points player_length
1           <NA>     22             2
2    Kobe Bryant     34            11
3 Paul A. Pierce     30            14
4     Steve Nash     17            10
Notice that a value of 2 is returned for the first player since this represents the length of ‘NA’ as a string.
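One related detail is worth keeping in mind: the default NA result applies to missing values stored inside a character vector, as in the player column. A bare NA on its own is a logical value, which nchar() first converts to the string 'NA' and then counts. A minimal sketch of the difference:

```r
#missing value of type character: the result is NA by default
nchar(NA_character_)

#a bare NA is logical, so it is coerced to the string 'NA' and counted
nchar(NA)                      #returns 2

#inside a character vector, NA is stored as a missing string
nchar(c(NA, 'Kobe Bryant'))    #returns NA 11
```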