One error message you may encounter in R is:

Coefficients: (1 not defined because of singularities)

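This error message occurs when you fit a model using the **glm()** function in R and two or more of your predictor variables have an exact linear relationship between them, known as perfect multicollinearity. The model is still fit, but R cannot estimate a separate coefficient for every predictor and reports **NA** for the ones it cannot estimate.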

To fix this error, you can use the **cor()** function to identify which variables in your dataset are perfectly correlated with each other (a correlation of exactly 1 or -1) and simply drop one of those variables from the regression model.

This tutorial shares how to address this error message in practice.

**How to Reproduce the Error**

Suppose we fit a logistic regression model to the following data frame in R:

**#define data
df <- data.frame(y = c(0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1),
                 x1 = c(3, 3, 4, 4, 3, 2, 5, 8, 9, 9, 9, 8, 9, 9, 9),
                 x2 = c(6, 6, 8, 8, 6, 4, 10, 16, 18, 18, 18, 16, 18, 18, 18),
                 x3 = c(4, 7, 7, 3, 8, 9, 9, 8, 7, 8, 9, 4, 9, 10, 13))
#fit logistic regression model
model <- glm(y ~ x1 + x2 + x3, data = df, family = binomial)
#view model summary
summary(model)
Call:
glm(formula = y ~ x1 + x2 + x3, family = binomial, data = df)
Deviance Residuals:
       Min          1Q      Median          3Q         Max
-1.372e-05  -2.110e-08   2.110e-08   2.110e-08   1.575e-05
Coefficients: (1 not defined because of singularities)
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -75.496 176487.031   0.000        1
x1            14.546  24314.459   0.001        1
x2                NA         NA      NA       NA
x3            -2.258  20119.863   0.000        1
(Dispersion parameter for binomial family taken to be 1)
    Null deviance: 2.0728e+01  on 14  degrees of freedom
Residual deviance: 5.1523e-10  on 12  degrees of freedom
AIC: 6
Number of Fisher Scoring iterations: 24
**

Notice that right before the coefficient output, we receive the message:

**Coefficients: (1 not defined because of singularities)**

This indicates that two or more predictor variables in the model have a perfect linear relationship and thus not every regression coefficient in the model can be estimated.

For example, notice that no coefficient estimate can be made for the **x2** predictor variable.
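The undefined coefficients can also be extracted from the model object instead of being read off the summary. The following is a minimal sketch that refits the same model and lists the terms whose estimate is `NA`. The name `aliased` is only an illustrative choice, and the fit may print a warning about fitted probabilities of 0 or 1 for this small data set.

```r
# Same data frame as in the example above
df <- data.frame(y = c(0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1),
                 x1 = c(3, 3, 4, 4, 3, 2, 5, 8, 9, 9, 9, 8, 9, 9, 9),
                 x2 = c(6, 6, 8, 8, 6, 4, 10, 16, 18, 18, 18, 16, 18, 18, 18),
                 x3 = c(4, 7, 7, 3, 8, 9, 9, 8, 7, 8, 9, 4, 9, 10, 13))

# Fit the logistic regression model with all three predictors
model <- glm(y ~ x1 + x2 + x3, data = df, family = binomial)

# Coefficients that could not be estimated are stored as NA
aliased <- names(coef(model))[is.na(coef(model))]
aliased
```

For this data the result should match the summary output, where **x2** is the only term shown as `NA`.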

**How to Handle the Error**

To identify which predictor variables are causing this error, we can use the **cor()** function to produce a correlation matrix and examine which variables have a correlation of exactly **1** (or **-1**) with each other:

**#create correlation matrix
cor(df)
           y        x1        x2        x3
y  1.0000000 0.9675325 0.9675325 0.3610320
x1 0.9675325 1.0000000 1.0000000 0.3872889
x2 0.9675325 1.0000000 1.0000000 0.3872889
x3 0.3610320 0.3872889 0.3872889 1.0000000
**

From the correlation matrix we can see that the variables **x1** and **x2** are perfectly correlated.

To resolve this error, we can simply drop one of those two variables from the model, since they don’t provide unique or independent information in the regression model.
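Before dropping anything, it can help to confirm which column is redundant, because a correlation matrix only flags pairwise relationships and would miss a predictor that is a linear combination of several others. One possible check, sketched below, compares the rank of the design matrix with its number of columns using base R's `model.matrix()` and `qr()`. The names `X`, `qr_X` and `redundant` are illustrative, and this is one option rather than the only way to do it.

```r
# Same data frame as in the example above
df <- data.frame(y = c(0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1),
                 x1 = c(3, 3, 4, 4, 3, 2, 5, 8, 9, 9, 9, 8, 9, 9, 9),
                 x2 = c(6, 6, 8, 8, 6, 4, 10, 16, 18, 18, 18, 16, 18, 18, 18),
                 x3 = c(4, 7, 7, 3, 8, 9, 9, 8, 7, 8, 9, 4, 9, 10, 13))

# Design matrix for the predictors, including the intercept column
X <- model.matrix(~ x1 + x2 + x3, data = df)

# QR decomposition: a rank below ncol(X) means some columns are
# exact linear combinations of the others
qr_X <- qr(X)
qr_X$rank
ncol(X)

# Columns pivoted past the rank are the linearly dependent ones
redundant <- colnames(X)[qr_X$pivot[-seq_len(qr_X$rank)]]
redundant
```

Here the rank is one less than the number of columns, and the redundant column is **x2**, which agrees with the correlation matrix.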

For example, suppose we drop **x2** and fit the following logistic regression model:

**#fit logistic regression model
model <- glm(y ~ x1 + x3, data = df, family = binomial)
#view model summary
summary(model)
Call:
glm(formula = y ~ x1 + x3, family = binomial, data = df)
Deviance Residuals:
       Min          1Q      Median          3Q         Max
-1.372e-05  -2.110e-08   2.110e-08   2.110e-08   1.575e-05
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -75.496 176487.031   0.000        1
x1            14.546  24314.459   0.001        1
x3            -2.258  20119.863   0.000        1
(Dispersion parameter for binomial family taken to be 1)
    Null deviance: 2.0728e+01  on 14  degrees of freedom
Residual deviance: 5.1523e-10  on 12  degrees of freedom
AIC: 6
Number of Fisher Scoring iterations: 24**

Notice that we don’t receive a “not defined because of singularities” error message this time.

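**Note**: It doesn’t matter whether we drop **x1** or **x2**. The fitted values and the overall goodness of fit of the model will be the same either way, and the coefficient for **x3** is unchanged. Only the coefficient of the variable you keep is rescaled: because **x2** is exactly twice **x1** in this data, the coefficient for **x2** would be half the coefficient for **x1**.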
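A quick way to check this is to fit both versions and compare them. The sketch below uses the illustrative names `model_x1` and `model_x2`. Because **x2** is exactly twice **x1** in this data, the slope on **x2** is expected to be half the slope on **x1**, while the residual deviance should match.

```r
# Same data frame as in the example above
df <- data.frame(y = c(0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1),
                 x1 = c(3, 3, 4, 4, 3, 2, 5, 8, 9, 9, 9, 8, 9, 9, 9),
                 x2 = c(6, 6, 8, 8, 6, 4, 10, 16, 18, 18, 18, 16, 18, 18, 18),
                 x3 = c(4, 7, 7, 3, 8, 9, 9, 8, 7, 8, 9, 4, 9, 10, 13))

# Keep x1 in one model and x2 in the other
model_x1 <- glm(y ~ x1 + x3, data = df, family = binomial)
model_x2 <- glm(y ~ x2 + x3, data = df, family = binomial)

# Residual deviance of each fit (goodness of fit)
deviance(model_x1)
deviance(model_x2)

# Ratio of the two slopes; should be close to 2 since x2 = 2 * x1
slope_ratio <- unname(coef(model_x1)["x1"] / coef(model_x2)["x2"])
slope_ratio
```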
