How to Use nchar() Function in R


The nchar() function in R can be used to count the number of characters in each string of a character vector.

This function uses the following basic syntax:

nchar(x, keepNA = NA)

where:

  • x: A character vector (or an object that can be coerced to one)
  • keepNA: Controls how NA values are handled. By default, NA is returned if NA is encountered. If set to FALSE, a value of 2 is returned to represent the length of ‘NA’ as a string.
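As a quick illustration of the keepNA argument, here is a minimal sketch on a small character vector. The vector x and the variable names are made up for this illustration, and the default character-count type is assumed:

```r
#hypothetical character vector with one missing value
x <- c("apple", "kiwi", NA)

#default behavior: the missing value stays NA
default_len <- nchar(x)

#keepNA=TRUE: the missing value also stays NA
keep_len <- nchar(x, keepNA = TRUE)

#keepNA=FALSE: the missing value is counted as the 2-character string 'NA'
count_len <- nchar(x, keepNA = FALSE)

default_len
keep_len
count_len
```

Only the keepNA=FALSE call should report a length of 2 for the missing value.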

The following examples show how to use this function in practice.

Example 1: Use nchar() to Count Length of Characters

Suppose we have the following data frame in R:

#create data frame
df <- data.frame(player=c('J Kidd', 'Kobe Bryant', 'Paul A. Pierce', 'Steve Nash'),
                 points=c(22, 34, 30, 17))

#view data frame
df

          player points
1         J Kidd     22
2    Kobe Bryant     34
3 Paul A. Pierce     30
4     Steve Nash     17

The following code shows how to use the nchar() function to count the length of each string in the player column:

#create new column that counts length of characters in player column
df$player_length <- nchar(df$player)

#view updated data frame
df

          player points player_length
1         J Kidd     22             6
2    Kobe Bryant     34            11
3 Paul A. Pierce     30            14
4     Steve Nash     17            10

The new column called player_length contains the length of each string in the player column.

Note that the nchar() function counts spaces and special characters as well.

For example, in the name ‘Paul A. Pierce’ the nchar() function counts the two spaces and the period along with all of the letters to get a total length of 14.
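If spaces should not be included in the count, one possible approach (a sketch, not the only way) is to strip them with gsub() before calling nchar(). The variable names below are assumptions made for this illustration:

```r
#single name taken from the example above
name <- "Paul A. Pierce"

#count every character, including spaces and the period
total_length <- nchar(name)

#remove literal spaces first, then count what remains
no_space_length <- nchar(gsub(" ", "", name, fixed = TRUE))

total_length
no_space_length
```

The difference between the two values is the number of spaces in the string.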

Example 2: Use nchar() with NA Values

Suppose we have the following data frame in R:

#create data frame
df <- data.frame(player=c(NA, 'Kobe Bryant', 'Paul A. Pierce', 'Steve Nash'),
                 points=c(22, 34, 30, 17))

#view data frame
df

          player points
1           <NA>     22
2    Kobe Bryant     34
3 Paul A. Pierce     30
4     Steve Nash     17

If we use the nchar() function to count the length of each string in the player column, then a value of NA will be returned for the first row by default:

#create new column that counts length of characters in player column
df$player_length <- nchar(df$player)

#view updated data frame
df

          player points player_length
1           <NA>     22            NA
2    Kobe Bryant     34            11
3 Paul A. Pierce     30            14
4     Steve Nash     17            10

However, if we use the argument keepNA=FALSE then a value of 2 will be returned for each missing (NA) value:

#overwrite player_length, counting each NA as the 2-character string 'NA'
df$player_length <- nchar(df$player, keepNA=FALSE)

#view updated data frame
df

          player points player_length
1           <NA>     22             2
2    Kobe Bryant     34            11
3 Paul A. Pierce     30            14
4     Steve Nash     17            10

Notice that a value of 2 is returned for the first player since this represents the length of ‘NA’ as a string.
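One thing to keep in mind with keepNA=FALSE is that a missing value becomes indistinguishable from a real two-character string when looking at the lengths alone. The sketch below uses a made-up vector (the name 'Al' is hypothetical) and pairs nchar() with is.na() to tell the two cases apart:

```r
#hypothetical vector with a missing value and a genuine 2-character name
player <- c(NA, "Al", "Kobe Bryant")

#both the missing value and 'Al' get a length of 2
len <- nchar(player, keepNA = FALSE)

#is.na() still identifies which entry was actually missing
is_missing <- is.na(player)

data.frame(player, len, is_missing)
```

If this distinction matters for the analysis, leaving keepNA at its default may be the safer choice.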

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Remove Last Character from String in R
How to Use substring Function in R
How to Use str_pad Function in R
