How to Extract String Before Space in R


You can use either of the following methods to extract the part of a string that comes before the first space in R:

Method 1: Extract String Before Space Using Base R

gsub(" .*$", "", my_string)

Method 2: Extract String Before Space Using stringr Package

library(stringr)

word(my_string, 1)

Both methods return the portion of my_string that comes before the first space.
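As a quick check, here is a minimal sketch that applies both methods to a single string. The value "23.2 miles" is a hypothetical example chosen for illustration, and the stringr call is wrapped in a guard so that it only runs if the package happens to be installed.

```r
# Hypothetical example string, used only for illustration
my_string <- "23.2 miles"

# Method 1: remove the first space and everything after it
before_space <- gsub(" .*$", "", my_string)
print(before_space)  # [1] "23.2"

# For a single-line string the pattern can match at most once,
# so sub() should give the same result as gsub() here
before_space_sub <- sub(" .*$", "", my_string)
print(before_space_sub)  # [1] "23.2"

# Method 2: only runs if the stringr package is installed
if (requireNamespace("stringr", quietly = TRUE)) {
  print(stringr::word(my_string, 1))
}
```

The pattern " .*$" matches a space followed by every remaining character up to the end of the string, so replacing that match with an empty string leaves only the text before the first space.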

The following examples show how to use each method in practice with this data frame:

#create data frame
df <- data.frame(athlete=c('A', 'B', 'C', 'D'),
                 distance=c('23.2 miles', '14 miles', '5 miles', '9.3 miles'))

#view data frame
df

  athlete   distance
1       A 23.2 miles
2       B   14 miles
3       C    5 miles
4       D  9.3 miles

Example 1: Extract String Before Space Using Base R

The following code shows how to extract the string before the space in each string in the distance column of the data frame:

#create new column that extracts string before space in distance column
df$distance_amount <- gsub(" .*$", "", df$distance)

#view updated data frame
df

  athlete   distance distance_amount
1       A 23.2 miles            23.2
2       B   14 miles              14
3       C    5 miles               5
4       D  9.3 miles             9.3

Notice that the new column called distance_amount contains the string before the space in the strings in the distance column of the data frame.
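Keep in mind that gsub() returns character values, so the new column holds text rather than numbers. The sketch below rebuilds the same data frame and shows one possible way to convert the result for calculations. The column name distance_num is a hypothetical name introduced here for illustration.

```r
# Recreate the data frame from this tutorial
df <- data.frame(athlete = c("A", "B", "C", "D"),
                 distance = c("23.2 miles", "14 miles", "5 miles", "9.3 miles"))

# Extract the string before the space in the distance column
df$distance_amount <- gsub(" .*$", "", df$distance)

# The extracted values are stored as text
print(class(df$distance_amount))  # [1] "character"

# Hypothetical extra column: convert the text to numbers
df$distance_num <- as.numeric(df$distance_amount)

# Numeric values can now be used in calculations
print(mean(df$distance_num))
```

This conversion assumes every extracted value is a valid number. as.numeric() returns NA with a warning for any value it cannot parse.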

Related: An Introduction to gsub() in R

Example 2: Extract String Before Space Using stringr Package

The following code shows how to extract the string before the space in each string in the distance column of the data frame by using the word() function from the stringr package in R:

library(stringr)

#create new column that extracts string before space in distance column
df$distance_amount <- word(df$distance, 1)

#view updated data frame
df

  athlete   distance distance_amount
1       A 23.2 miles            23.2
2       B   14 miles              14
3       C    5 miles               5
4       D  9.3 miles             9.3

The new column called distance_amount again contains the string before the space in each value of the distance column.

This matches the results from using the gsub() function in base R.

Note that the word() function from the stringr package extracts words from a given string, where words are separated by a single space by default.

By supplying the value 1 to this function, we extract the first word in each string, which is equivalent to extracting the string before the first space.
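To illustrate this equivalence, here is a small sketch that compares the two approaches on strings with several spaces and with no space at all. The vector x is a hypothetical example, and the comparison with word() only runs if stringr is installed.

```r
# Hypothetical strings: one with several spaces, one with no space
x <- c("9.3 miles per day", "marathon")

# Base R: drop the first space and everything after it.
# A string without a space has nothing to match, so it is returned unchanged.
base_result <- gsub(" .*$", "", x)
print(base_result)  # "9.3" and "marathon"

# stringr: extract the first word, then compare with the base R result
if (requireNamespace("stringr", quietly = TRUE)) {
  stringr_result <- stringr::word(x, 1)
  print(identical(base_result, stringr_result))
}
```

For inputs like these, both approaches should agree. Strings that begin with a space or use other whitespace such as tabs are not covered by this sketch and are worth testing separately.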

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Extract String After Specific Character in R
How to Extract String Between Specific Characters in R
How to Remove Characters from String in R
How to Find Location of Character in a String in R
