When you fit a regression model using most statistical software, you’ll often notice the following two values in the output:
Multiple R: The multiple correlation coefficient, which is the correlation between the observed values of the response variable and the values predicted by the model.
R-Squared: This is calculated as (Multiple R)² and it represents the proportion of the variance in the response variable of a regression model that can be explained by the predictor variables. This value ranges from 0 to 1.
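The relationship between these two values can be checked numerically. The sketch below uses a small made-up dataset (the numbers and variable names are hypothetical, not taken from this article) and fits a two-predictor least squares model by hand, using only the Python standard library. It then computes Multiple R as the correlation between the observed and fitted values and compares its square with R-squared computed from sums of squares.

```python
from math import sqrt

# Hypothetical illustrative data (NOT the dataset used in this article)
x1 = [1, 2, 2, 3, 4, 5, 5, 6]            # e.g. hours studied
x2 = [70, 65, 80, 75, 85, 80, 90, 95]    # e.g. current grade
y = [62, 64, 71, 73, 80, 82, 88, 93]     # e.g. exam score

def mean(v):
    return sum(v) / len(v)

def cross(u, v):
    """Sum of cross-products of deviations from the means of u and v."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

# Closed-form least squares solution for two predictors plus an intercept
s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
s1y, s2y = cross(x1, y), cross(x2, y)
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
b0 = mean(y) - b1 * mean(x1) - b2 * mean(x2)

fitted = [b0 + b1 * a + b2 * b for a, b in zip(x1, x2)]

# Multiple R: Pearson correlation between observed and fitted values
multiple_r = cross(y, fitted) / sqrt(cross(y, y) * cross(fitted, fitted))

# R-squared: 1 - (residual sum of squares / total sum of squares)
ss_res = sum((obs - fit) ** 2 for obs, fit in zip(y, fitted))
r_squared = 1 - ss_res / cross(y, y)

print(f"Multiple R:     {multiple_r:.4f}")
print(f"(Multiple R)^2: {multiple_r ** 2:.4f}")
print(f"R-squared:      {r_squared:.4f}")
```

For a least squares model that includes an intercept, the last two printed values should agree up to floating-point error.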
In practice, we’re often interested in the R-squared value because it tells us how useful the predictor variables are at predicting the value of the response variable.
However, each time we add a new predictor variable to the model, the R-squared can never decrease, and in practice it almost always increases at least slightly, even if the predictor variable isn’t useful.
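This behavior can be seen in a small experiment. The sketch below is illustrative only: the data are made up, the helper names `solve` and `r_squared` are hypothetical, and the least squares fit is done with a plain normal-equations solver from the standard library rather than a statistics package. It fits one model with a single real predictor, then a second model that adds a column of pure random noise, and compares the two R-squared values.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def r_squared(predictors, y):
    """R-squared of a least squares fit of y on the predictor columns plus an intercept."""
    rows = [[1.0] + [col[i] for col in predictors] for i in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Hypothetical data (NOT the dataset used in this article)
hours = [1, 2, 2, 3, 4, 5, 5, 6, 7, 8]
score = [62, 64, 71, 73, 80, 82, 88, 93, 91, 97]

# A predictor that is pure noise and unrelated to the response
random.seed(0)
noise = [random.gauss(0, 1) for _ in hours]

r2_base = r_squared([hours], score)
r2_with_noise = r_squared([hours, noise], score)

print(f"R-squared, hours only:    {r2_base:.4f}")
print(f"R-squared, hours + noise: {r2_with_noise:.4f}")
```

The second value should be at least as large as the first, even though the added predictor carries no information about the response.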
The adjusted R-squared is a modified version of R-squared that adjusts for the number of predictors in a regression model. It is calculated as:
Adjusted R² = 1 - [(1 - R²) * (n - 1) / (n - k - 1)]
where:
- R²: The R-squared of the model
- n: The number of observations
- k: The number of predictor variables
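The formula translates directly into a few lines of code. The function below is a minimal sketch; the name `adjusted_r_squared` and the input check are illustrative choices, not part of any particular statistics library. The loop holds R-squared and the sample size fixed at made-up values and varies only the number of predictors, to show how the adjustment grows with k.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R-squared for a model with k predictors fit to n observations."""
    if n - k - 1 <= 0:
        raise ValueError("n must be greater than k + 1")
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Same R-squared (0.90) and sample size (20), increasing number of predictors.
# These inputs are made up purely to illustrate the penalty.
for k in (1, 5, 10):
    print(k, round(adjusted_r_squared(0.90, 20, k), 3))
```

With R-squared and n held fixed, the adjusted value drops as k increases, so a larger model has to raise R-squared enough to offset the penalty.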
Since R-squared never decreases as you add more predictors to a model, adjusted R-squared can serve as a metric that tells you how useful a model is, adjusted for the number of predictors in the model.
To gain a better understanding of each of these terms, consider the following example.
Example: Multiple R, R-Squared, & Adjusted R-Squared
Suppose we have a dataset that contains the following three variables for 12 different students: Study Hours, Current Grade, and Exam Score.
Suppose we fit a multiple linear regression model using Study Hours and Current Grade as the predictor variables and Exam Score as the response variable and get the following output:
We can observe the values for the following three metrics:
Multiple R: 0.978. This represents the multiple correlation between the response variable and the two predictor variables.
R Square: 0.956. This is calculated as (Multiple R)² = (0.978)² = 0.956. This tells us that 95.6% of the variation in exam scores can be explained by the number of hours spent studying by the student and their current grade in the course.
Adjusted R-Square: 0.946. This is calculated as:
Adjusted R² = 1 - [(1 - R²) * (n - 1) / (n - k - 1)] = 1 - [(1 - 0.956) * (12 - 1) / (12 - 2 - 1)] = 0.946.
This represents the R-squared value, adjusted for the number of predictor variables in the model.
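The arithmetic in this example can be reproduced in a few lines. The snippet below is only a check of the reported numbers; it starts from the rounded values in the output (Multiple R = 0.978, n = 12, k = 2), so the results match the reported figures to three decimal places rather than exactly.

```python
multiple_r = 0.978   # Multiple R reported in the output
n, k = 12, 2         # 12 students, 2 predictor variables

# R-squared is the square of Multiple R
r_squared = multiple_r ** 2

# Adjusted R-squared, using the rounded R-squared of 0.956 as in the text
adjusted = 1 - (1 - 0.956) * (n - 1) / (n - k - 1)

print(round(r_squared, 3))   # 0.956
print(round(adjusted, 3))    # 0.946
```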
This metric is useful for comparing models with different numbers of predictors. For example, if we fit another regression model with 10 predictors to the same data and found that its adjusted R-squared was 0.88, this would suggest that the model with just two predictors is preferable, because it has the higher adjusted R-squared value.
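To see why this comparison is informative, it helps to work backwards from the adjusted value. The sketch below assumes the hypothetical 10-predictor model is fit to the same 12 observations (the text does not state this), and inverts the adjusted R-squared formula to recover the plain R-squared that such a model would need in order to report an adjusted R-squared of 0.88. The function name `r_squared_from_adjusted` is illustrative.

```python
def r_squared_from_adjusted(adjusted, n, k):
    """Invert the adjusted R-squared formula to recover the unadjusted R-squared."""
    return 1 - (1 - adjusted) * (n - k - 1) / (n - 1)

# Assumption: the 10-predictor model uses the same n = 12 observations
r2_ten_predictors = r_squared_from_adjusted(0.88, n=12, k=10)
r2_two_predictors = 0.956  # reported for the two-predictor model

print(round(r2_ten_predictors, 3))
print(r2_ten_predictors > r2_two_predictors)
```

Under that assumption, the larger model would have the higher plain R-squared and still the lower adjusted R-squared, which is the situation the adjustment is designed to flag.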