How to Fix in R: invalid model formula in ExtractVars


One error you may encounter in R is:

Error in terms.formula(formula, data = data) : 
  invalid model formula in ExtractVars

This error occurs when you attempt to fit a model in R, such as a decision tree, and incorrectly specify one or more of the variables in the formula, most often by wrapping variable names in quotation marks.

This tutorial shares exactly how to fix this error in practice.

How to Reproduce the Error

Suppose we create the following data frame in R:

#create data frame
df <- data.frame(rating=c(88, 94, 99, 90, 76, 78, 81, 88),
                 points=c(14, 17, 22, 24, 25, 22, 29, 31),
                 assists=c(7, 7, 6, 12, 10, 11, 17, 2),
                 rebounds=c(7, 8, 8, 12, 9, 5, 11, 15))

#view data frame
df

  rating points assists rebounds
1     88     14       7        7
2     94     17       7        8
3     99     22       6        8
4     90     24      12       12
5     76     25      10        9
6     78     22      11        5
7     81     29      17       11
8     88     31       2       15

Now suppose we attempt to use the rpart() function to fit a decision tree model to the data:

library(rpart)

#attempt to fit decision tree model to data
model <- rpart(rating ~ "points" + "assists" + "rebounds", data = df)

Error in terms.formula(formula, data = data) : 
  invalid model formula in ExtractVars

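We receive an error because we wrapped the predictor variable names in quotation marks. In a formula, variables must be written as bare names; a quoted name is treated as a character string, which is not a valid term in the formula.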
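The error is raised while R processes the formula, so it is not specific to rpart(). As a minimal sketch, the following code captures the error message from terms() and from lm() using a quoted formula and compares it with an unquoted one. The helper name get_error_message is hypothetical and used only for illustration.

```r
#create a small data frame (first four rows of the example data)
df <- data.frame(rating=c(88, 94, 99, 90),
                 points=c(14, 17, 22, 24))

#hypothetical helper: return the error message of an expression, or NA if no error occurs
get_error_message <- function(expr) {
  tryCatch({
    expr
    NA_character_
  }, error = function(e) conditionMessage(e))
}

#quoted variable name passed directly to terms()
msg_terms <- get_error_message(terms(rating ~ "points"))

#same quoted formula passed to lm()
msg_lm <- get_error_message(lm(rating ~ "points", data = df))

#unquoted formula for comparison
msg_ok <- get_error_message(lm(rating ~ points, data = df))

#view captured messages
print(msg_terms)
print(msg_lm)
print(msg_ok)
```

The first two objects should hold an error message like the one shown above, while msg_ok should be NA because the unquoted formula is valid.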

How to Fix the Error

The way to fix this error is to remove the quotation marks around the variable names and write the formula as follows:

library(rpart)

#fit decision tree model
model <- rpart(rating ~ points + assists + rebounds, data = df)

#view summary of model
summary(model)

Call:
rpart(formula = rating ~ points + assists + rebounds, data = df)
  n= 8 

    CP nsplit rel error xerror xstd
1 0.01      0         1      0    0

Node number 1: 8 observations
  mean=86.75, MSE=55.1875 

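We’re able to fit the model without any errors because we removed the quotation marks from the predictor variables in the formula.

Note that the fitted tree contains only the root node: the data frame has just 8 observations, which is fewer than the default minsplit = 20 in rpart.control(), so no splits are attempted.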
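Quoted names often end up in a formula because the variable names are stored as character strings, for example in a vector of column names. In that case, one option is to build the formula with the reformulate() function from the stats package instead of typing the strings into the formula. The sketch below assumes the same data frame as in the example; the object names predictors and f are chosen only for illustration.

```r
library(rpart)

#create data frame
df <- data.frame(rating=c(88, 94, 99, 90, 76, 78, 81, 88),
                 points=c(14, 17, 22, 24, 25, 22, 29, 31),
                 assists=c(7, 7, 6, 12, 10, 11, 17, 2),
                 rebounds=c(7, 8, 8, 12, 9, 5, 11, 15))

#predictor names stored as character strings
predictors <- c("points", "assists", "rebounds")

#build the formula rating ~ points + assists + rebounds from the strings
f <- reformulate(predictors, response = "rating")

#view formula
print(f)

#fit decision tree model using the constructed formula
model <- rpart(f, data = df)
```

An equivalent base R alternative is as.formula(paste("rating ~", paste(predictors, collapse = " + "))).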

Additional Resources

The following tutorials explain how to fix other common errors in R:

How to Fix: the condition has length > 1 and only the first element will be used
How to Fix: non-numeric argument to binary operator
How to Fix: dim(X) must have a positive length
How to Fix: error in select unused arguments
