How to Calculate VIF in Excel


Multicollinearity in regression analysis occurs when two or more explanatory variables are highly correlated to each other, such that they do not provide unique or independent information in the regression model. If the degree of correlation is high enough between variables, it can cause problems when fitting and interpreting the regression model. 

Fortunately, it’s possible to detect multicollinearity using a metric known as the variance inflation factor (VIF), which measures the correlation and strength of correlation between the explanatory variables in a regression model.

This tutorial explains how to calculate VIF in Excel.

Example: Calculating VIF in Excel

For this example we will perform a multiple linear regression using the following dataset that describes the attributes of 10 basketball players. We will fit a regression model using rating as the response variable and points, assists, and rebounds as the explanatory variables. Then, we’ll identify the VIF values for each explanatory variable.

Raw data in Excel

Step 1: Perform a multiple linear regression.

Along the top ribbon, go to the Data tab and click on Data Analysis. If you don’t see this option, then you need to first install the free Analysis ToolPak.

Data Analysis Toolpak in Excel

Once you click on Data Analysis, a new window will pop up. Select Regression and click OK.

Regression with Data Analysis Toolpak in Excel

Fill in the necessary arrays for the response variables and the explanatory variables, then click OK.

Multiple regression in Excel

This produces the following output:

Regression output in Excel

Step 2: Calculate the VIF for each explanatory variable.

Next, we can calculate the VIF for each of the three explanatory variables by performing individual regressions using one explanatory variable as the response variable and the other two as the explanatory variables.

For example, we can calculate the VIF for the variable points by performing a multiple linear regression using points as the response variable and assists and rebounds as the explanatory variables.

This produces the following output:

Calculating VIF in Excel

The VIF for points is calculated as 1 / (1 – R Square) = 1 / (1 – .433099) = 1.76.

We can then repeat this process for the other two variables assists and rebounds.

It turns out that the VIF for the three explanatory variables are as follows:

points: 1.76
assists: 1.96
rebounds: 1.18

How to Interpret VIF Values

The value for VIF starts at 1 and has no upper limit. A general rule of thumb for interpreting VIFs is as follows:

  • A value of 1 indicates there is no correlation between a given explanatory variable and any other explanatory variables in the model.
  • A value between 1 and 5 indicates moderate correlation between a given explanatory variable and other explanatory variables in the model, but this is often not severe enough to require attention.
  • A value greater than 5 indicates potentially severe correlation between a given explanatory variable and other explanatory variables in the model. In this case, the coefficient estimates and p-values in the regression output are likely unreliable.

Given that each of the VIF values for the explanatory variables in our regression model are close to 1, multicollinearity is not a problem in our example.

Leave a Reply

Your email address will not be published.