One common warning you may encounter in R is:

**glm.fit: algorithm did not converge**

This warning often occurs when you fit a logistic regression model in R and the data exhibit **perfect separation**: a predictor variable perfectly separates the response variable into 0’s and 1’s.

The following example shows how to handle this warning in practice.

**How to Reproduce the Warning**

Suppose we attempt to fit the following logistic regression model in R:

    #create data frame
    df <- data.frame(x=c(.1, .2, .3, .4, .5, .6, .7, .8, .9, 1, 1, 1.1, 1.3, 1.5, 1.7),
                     y=c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1))

    #attempt to fit logistic regression model
    glm(y~x, data=df, family="binomial")

    Call:  glm(formula = y ~ x, family = "binomial", data = df)

    Coefficients:
    (Intercept)            x  
         -409.1        431.1  

    Degrees of Freedom: 14 Total (i.e. Null);  13 Residual
    Null Deviance:     20.19 
    Residual Deviance: 2.468e-09    AIC: 4
    Warning messages:
    1: glm.fit: algorithm did not converge 
    2: glm.fit: fitted probabilities numerically 0 or 1 occurred 

Notice that we receive the warning message: **glm.fit: algorithm did not converge**.

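We receive this message because the predictor variable x perfectly separates the response variable y into 0’s and 1’s. In that case no finite maximum likelihood estimate exists: the fit keeps improving as the coefficient on x grows, so the iterations never settle. This is also why the coefficient estimates in the output are so large (-409.1 and 431.1).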

Notice that for every x value less than 1, y is equal to 0. And for every x value equal to or greater than 1, y is equal to 1.
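We can also verify this separation in code rather than by eye. The following is a minimal sketch that compares the largest x value among the rows where y is 0 with the smallest x value among the rows where y is 1. The names `max_x0`, `min_x1`, and `separated` are our own, and the check assumes that larger values of x go with y equal to 1:

```r
#create data frame (same data as above)
df <- data.frame(x=c(.1, .2, .3, .4, .5, .6, .7, .8, .9, 1, 1, 1.1, 1.3, 1.5, 1.7),
                 y=c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1))

#largest x where y is 0 and smallest x where y is 1
max_x0 <- max(df$x[df$y == 0])
min_x1 <- min(df$x[df$y == 1])

#TRUE when the two groups do not overlap on x
separated <- max_x0 < min_x1
separated
```

If `separated` is TRUE, a single cutoff on x splits the 0’s from the 1’s, which is the situation that triggers the warning.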

The following code shows a scenario where the predictor variable is not able to perfectly separate the response variable into 0’s and 1’s:

    #create data frame
    df <- data.frame(x=c(.1, .2, .3, .4, .5, .6, .7, .8, .9, 1, 1, 1.1, 1.3, 1.5, 1.7),
                     y=c(0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1))

    #fit logistic regression model
    glm(y~x, data=df, family="binomial")

    Call:  glm(formula = y ~ x, family = "binomial", data = df)

    Coefficients:
    (Intercept)            x  
         -2.112        2.886  

    Degrees of Freedom: 14 Total (i.e. Null);  13 Residual
    Null Deviance:     20.73 
    Residual Deviance: 16.31    AIC: 20.31

We don’t receive any warning message because the predictor variable is not able to perfectly separate the response variable into 0’s and 1’s.
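Instead of watching the console for the warning, we can also inspect the `converged` element that `glm()` stores in the fitted model object. The sketch below refits both models under the hypothetical names `fit_sep` and `fit_mix`; `suppressWarnings()` is used only to keep the output quiet for the separated data:

```r
#perfectly separated data
df_sep <- data.frame(x=c(.1, .2, .3, .4, .5, .6, .7, .8, .9, 1, 1, 1.1, 1.3, 1.5, 1.7),
                     y=c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1))

#data without perfect separation
df_mix <- data.frame(x=c(.1, .2, .3, .4, .5, .6, .7, .8, .9, 1, 1, 1.1, 1.3, 1.5, 1.7),
                     y=c(0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1))

#fit both models, silencing the expected warnings for the first one
fit_sep <- suppressWarnings(glm(y~x, data=df_sep, family="binomial"))
fit_mix <- glm(y~x, data=df_mix, family="binomial")

#logical flag set by glm(): did the fitting algorithm converge?
fit_sep$converged
fit_mix$converged
```

Given the warning shown earlier, we would expect the flag to be FALSE for the separated data and TRUE for the second data frame.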

**How to Handle the Warning**

If we encounter a scenario with perfect separation, there are two ways to handle it:

**Method 1: Use penalized regression.**

One option is to use some form of penalized logistic regression such as lasso logistic regression or elastic-net regularization.

Refer to the glmnet package for options on how to implement penalized logistic regression in R.
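To illustrate the idea behind penalization on our small example, here is a minimal sketch of ridge-penalized logistic regression written in base R with `optim()`. This is not the glmnet implementation; to our knowledge glmnet expects a predictor matrix with at least two columns, so it does not apply directly to a single predictor. The function name `ridge_nll` and the penalty value `lambda = 0.1` are assumptions chosen purely for illustration:

```r
#perfectly separated data from the first example
df <- data.frame(x=c(.1, .2, .3, .4, .5, .6, .7, .8, .9, 1, 1, 1.1, 1.3, 1.5, 1.7),
                 y=c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1))

#negative log-likelihood of the logistic model plus a ridge penalty on the slope
ridge_nll <- function(beta, x, y, lambda) {
  eta <- beta[1] + beta[2] * x
  #log(1 + exp(eta)) computed in a numerically stable way
  nll <- sum(pmax(eta, 0) + log1p(exp(-abs(eta))) - y * eta)
  nll + lambda * beta[2]^2
}

#penalty strength (assumed value, not tuned)
lambda <- 0.1

#minimize the penalized objective starting from (0, 0)
fit <- optim(c(0, 0), ridge_nll, x=df$x, y=df$y, lambda=lambda, method="BFGS")

#penalized estimates: intercept, slope
fit$par

#0 indicates that optim() reports successful convergence
fit$convergence
```

Because the penalty grows with the square of the slope, the estimate can no longer drift toward infinity, so the optimizer settles on finite coefficients even though the data are perfectly separated. In practice the penalty strength would be chosen by cross-validation rather than fixed by hand.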

**Method 2: Use the predictor variable to perfectly predict the response variable.**

If you suspect that this perfect separation may exist in the population, you can simply use that predictor variable to perfectly predict the value of the response variable.

For example, in the above scenario we saw that the response variable **y** was always equal to 0 when the predictor variable **x** was less than 1.

If we suspect that this relationship holds in the overall population, we can just always predict that the value of **y** will be equal to 0 when **x** is less than 1 and not worry about fitting some penalized logistic regression model.
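Such a rule can be written as a one-line function. The sketch below uses a hypothetical helper named `predict_y` with an assumed cutoff of 1, taken from the pattern we observed in the data, and checks that the rule reproduces every observed value of y:

```r
#perfectly separated data from the first example
df <- data.frame(x=c(.1, .2, .3, .4, .5, .6, .7, .8, .9, 1, 1, 1.1, 1.3, 1.5, 1.7),
                 y=c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1))

#predict 1 when x is at or above the cutoff, otherwise 0
predict_y <- function(x, cutoff=1) {
  as.integer(x >= cutoff)
}

#compare the rule's predictions with the observed responses
rule_matches <- all(predict_y(df$x) == df$y)
rule_matches
```

Keep in mind that this rule is only as good as the assumption that the cutoff holds in the population; it offers no estimate of uncertainty for new x values near 1.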

**Additional Resources**

The following tutorials offer additional information on working with the **glm()** function in R:

The Difference Between glm and lm in R

How to Use the predict function with glm in R

How to Handle: glm.fit: fitted probabilities numerically 0 or 1 occurred
