Often when we fit a linear regression model, we use **R-squared** as a way to assess how well a model fits the data.

R-squared represents the proportion of the variance in the response variable that can be explained by the predictor variables in a regression model.

This number ranges from 0 to 1, with higher values indicating a better model fit.

However, there is no such R-squared value for general linear models like logistic regression models and Poisson regression models.

Instead, we can calculate a metric known as **McFadden’s R-Squared**, which ranges from 0 to just under 1, with higher values indicating a better model fit.

We use the following formula to calculate McFadden’s R-Squared:

McFadden’s R-Squared = 1 – (log likelihood_{model} / log likelihood_{null})

where:

**log likelihood**: Log likelihood value of current fitted model_{model}**log likelihood**: Log likelihood value of null model (model with intercept only)_{null}

In practice, values over 0.40 indicate that a model fits the data very well.

The following example shows how to calculate McFadden’s R-Squared for a logistic regression model in R.

**Example: Calculating McFadden’s R-Squared in R**

For this example, we’ll use the **Default** dataset from the ISLR package. We can use the following code to load and view a summary of the dataset:

#install and load ISLR package install.packages('ISLR') library(ISLR) #define dataset data <- ISLR::Default #view summary of dataset summary(data) default student balance income No :9667 No :7056 Min. : 0.0 Min. : 772 Yes: 333 Yes:2944 1st Qu.: 481.7 1st Qu.:21340 Median : 823.6 Median :34553 Mean : 835.4 Mean :33517 3rd Qu.:1166.3 3rd Qu.:43808 Max. :2654.3 Max. :73554 #find total observations in dataset nrow(data) [1] 10000

This dataset contains the following information about 10,000 individuals:

**default:**Indicates whether or not an individual defaulted.**student:**Indicates whether or not an individual is a student.**balance:**Average balance carried by an individual.**income:**Income of the individual.

We will use student status, bank balance, and income to build a logistic regression model that predicts the probability that a given individual defaults:

#fit logistic regression model model <- glm(default~student+balance+income, family='binomial', data=data) #view model summary summary(model) Call: glm(formula = default ~ balance + student + income, family = "binomial", data = data) Deviance Residuals: Min 1Q Median 3Q Max -2.4691 -0.1418 -0.0557 -0.0203 3.7383 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) -1.087e+01 4.923e-01 -22.080 < 2e-16 *** balance 5.737e-03 2.319e-04 24.738 < 2e-16 *** studentYes -6.468e-01 2.363e-01 -2.738 0.00619 ** income 3.033e-06 8.203e-06 0.370 0.71152 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 2920.6 on 9999 degrees of freedom Residual deviance: 1571.5 on 9996 degrees of freedom AIC: 1579.5 Number of Fisher Scoring iterations: 8

Next, we’ll use the following formula to calculate McFadden’s R-squared value for this model:

#calculate McFadden's R-squared for model with(summary(model), 1 - deviance/null.deviance) [1] 0.4619194

McFadden’s R-squared value turns out to be **0.4619194**. This value is fairly high, which indicates that our model fits the data well and has high predictive power.

Also note that we could use the **pR2()** function from the **pscl** package to calculate McFadden’s R-square value for the model as well:

#install and load pscl package install.packages('pscl') library(pscl) #calculate McFadden's R-squared for model pR2(model)['McFadden'] McFadden 0.4619194

Notice that this value matches the one calculated earlier.

**Additional Resources**

The following tutorials explain how to perform other common tasks in R:

How to Calculate R-Squared in R

How to Calculate Adjusted R-Squared in R

What is a Good R-squared Value?

Hi, I get slightly different values when I calculate McFadden’s R-squared using the first method compared to the second for the same glm output.

Method 1:

#calculate McFadden’s R-squared for model

with(summary(model), 1 – deviance/null.deviance)

Method 2:

#calculate McFadden’s R-squared for model

pR2(model)[‘McFadden’]

Any idea why?

Hi,

I’ve tried running both the pR2 and the with(summary(model) methods for 7 of my models and have got different results from the two methods for each model. Do you know why? My models are negative binomial and binomial GLMs.

Hi,

Thank you for this wonderful source. I just have one question.

Can I use the pR2() function you mentioned above for Poisson regression as well?

This is my model:

model <- glm (UC ~ Pasture + Rainfall + Temp + offset(log(Pop)), family = poisson(link = "log"), data = data)

I installed the package and just typed:

pR2(model)['McFadden']

and the result is McFadden: 0.6536465

Do you think this result is correct?

I would be grateful if you let me know. Thanks a lot.