How to Convert Categorical Variables to Numeric in R


You can use one of the following methods to convert a categorical variable to a numeric variable in R:

Method 1: Convert One Categorical Variable to Numeric

df$var1 <- unclass(df$var1)

Method 2: Convert Multiple Categorical Variables to Numeric

df[, c('var1', 'var2')] <- sapply(df[, c('var1', 'var2')], unclass)

Method 3: Convert All Categorical Variables to Numeric

df[sapply(df, is.factor)] <- data.matrix(df[sapply(df, is.factor)])

The examples below show how to use each method with this data frame:

#create data frame with some categorical variables
df <- data.frame(team=as.factor(c('A', 'B', 'C', 'D')),
                 conf=as.factor(c('AL', 'AL', 'NL', 'NL')),
                 win=as.factor(c('Yes', 'No', 'No', 'Yes')),
                 points=c(122, 98, 106, 115))

#view data frame
df

  team conf win points
1    A   AL Yes    122
2    B   AL  No     98
3    C   NL  No    106
4    D   NL Yes    115

Method 1: Convert One Categorical Variable to Numeric

The following code shows how to convert one categorical variable in a data frame to a numeric variable:

#convert 'team' variable to numeric
df$team <- unclass(df$team)

#view updated data frame
df

  team conf win points
1    1   AL Yes    122
2    2   AL  No     98
3    3   NL  No    106
4    4   NL Yes    115

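Notice that the values of the ‘team’ variable have been replaced by the integer codes 1 to 4. Each code is the position of the value among the factor's levels, which are sorted alphabetically by default.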
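To see which code each level receives, it can help to inspect the factor's levels directly. The following is a minimal sketch that uses a standalone factor created only for illustration (the names team, codes_unclass, codes_int and mapping are assumptions, not part of the data frame above). It also shows as.integer(), a base R alternative that returns the same codes without the levels attribute that unclass() keeps:

```r
# standalone factor used only for illustration
team <- factor(c('B', 'A', 'D', 'C'))

# levels are sorted alphabetically by default
levels(team)

# unclass() returns the integer codes but keeps the 'levels' attribute
codes_unclass <- unclass(team)

# as.integer() returns the same codes as a plain integer vector
codes_int <- as.integer(team)

# look-up table showing which code each level received
mapping <- data.frame(level = levels(team), code = seq_along(levels(team)))
mapping
```

If the codes need to follow a different order, the levels can be set explicitly when the factor is created, for example with the levels argument of factor().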

Method 2: Convert Multiple Categorical Variables to Numeric

The following code shows how to convert multiple categorical variables in a data frame to numeric variables:

#convert 'team' and 'win' variables to numeric
df[, c('team', 'win')] <- sapply(df[, c('team', 'win')], unclass)

#view updated data frame
df

  team conf win points
1    1   AL   2    122
2    2   AL   1     98
3    3   NL   1    106
4    4   NL   2    115

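Notice that the values for the ‘team’ and ‘win’ variables have been converted to numeric values. For ‘win’, the level ‘No’ is coded as 1 and ‘Yes’ as 2 because factor levels are sorted alphabetically by default.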
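Because sapply() simplifies its result to a matrix, some users prefer lapply(), which returns a list and converts each column separately. The following is a minimal sketch of that variant, assuming a fresh copy of the example data (the names df2 and cols are chosen here for illustration):

```r
# small example data frame (same structure as the one above)
df2 <- data.frame(team=as.factor(c('A', 'B', 'C', 'D')),
                  win=as.factor(c('Yes', 'No', 'No', 'Yes')),
                  points=c(122, 98, 106, 115))

# columns to convert
cols <- c('team', 'win')

# lapply() returns a list, so each column is converted on its own
df2[cols] <- lapply(df2[cols], as.integer)

# check the class of every column
sapply(df2, class)
```

Both approaches should give the same codes here; the lapply() form simply avoids the intermediate matrix.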

Method 3: Convert All Categorical Variables to Numeric

The following code shows how to convert all categorical variables in a data frame to numeric variables:

#convert all categorical variables to numeric
df[sapply(df, is.factor)] <- data.matrix(df[sapply(df, is.factor)])

#view updated data frame
df

  team conf win points
1    1    1   2    122
2    2    1   1     98
3    3    2   1    106
4    4    2   2    115

Notice that the values for each of the categorical variables in the data frame have been converted to numeric values.
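After converting every factor column, it is worth confirming that no factors remain. The following is a minimal sketch of such a check, assuming a fresh copy of part of the example data (the names df3 and is_fac are chosen here for illustration):

```r
# example data frame with two factor columns and one numeric column
df3 <- data.frame(conf=as.factor(c('AL', 'AL', 'NL', 'NL')),
                  win=as.factor(c('Yes', 'No', 'No', 'Yes')),
                  points=c(122, 98, 106, 115))

# logical index of the factor columns
is_fac <- sapply(df3, is.factor)

# convert every factor column to its integer codes
df3[is_fac] <- data.matrix(df3[is_fac])

# confirm that no factor columns remain and that all columns are numeric
any(sapply(df3, is.factor))
all(sapply(df3, is.numeric))
```

Storing the logical index in a variable avoids calling sapply(df, is.factor) twice and makes it easy to see which columns were changed.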

Additional Resources

The following tutorials explain how to perform other common conversions in R:

How to Convert Date to Numeric in R
How to Convert Character to Factor in R
How to Convert Factor to Character in R
