**Multiple linear regression** is a method we can use to understand the relationship between two or more explanatory variables and a response variable.

This tutorial explains how to perform multiple linear regression in SPSS.

**Example: Multiple Linear Regression in SPSS**

Suppose we want to know if the number of hours spent studying and the number of prep exams taken affect the score that a student receives on a certain exam. To explore this, we can perform multiple linear regression using the following variables:

**Explanatory variables:**

- Hours studied
- Prep exams taken

**Response variable:**

- Exam score

Use the following steps to perform this multiple linear regression in SPSS.

**Step 1: Enter the data.**

Enter the following data for the number of hours studied, prep exams taken, and exam score received for 20 students:

**Step 2: Perform multiple linear regression.**

Click the **Analyze** tab, then **Regression**, then **Linear**:

Drag the variable **score** into the box labelled Dependent. Drag the variables **hours** and **prep_exams** into the box labelled Independent(s). Then click **OK**.
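For readers who want to cross-check the fit outside of SPSS, here is a minimal sketch in Python of an ordinary least squares fit using the normal equations. It is an illustration only, not SPSS's internal algorithm: the function name `fit_ols` is hypothetical, and the toy data below is made up so that the expected coefficients are known in advance. It is not the 20-student data set from Step 1.

```python
def fit_ols(rows, y):
    """Return least-squares coefficients [intercept, b1, b2, ...].

    rows: one tuple of explanatory values per observation, e.g. (hours, prep_exams)
    y:    the response value (e.g. score) for each observation
    """
    # Design matrix with a leading column of ones for the intercept.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(X[0])
    # Augmented normal equations: (X'X | X'y).
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Hypothetical toy data built from score = 60 + 5*hours - 1*prep_exams exactly.
toy_rows = [(1, 1), (2, 0), (3, 2), (4, 1), (5, 3)]
toy_scores = [60 + 5 * h - p for h, p in toy_rows]
print(fit_ols(toy_rows, toy_scores))
```

Passing the real `hours`, `prep_exams`, and `score` columns in the same shape should give coefficients that match the Unstandardized B column in the SPSS output, up to rounding.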

**Step 3: Interpret the output.**

Once you click **OK**, the results of the multiple linear regression will appear in a new window.

The first table we’re interested in is titled **Model Summary**:

Here is how to interpret the most relevant numbers in this table:

- **R Square:** This is the proportion of the variance in the response variable that can be explained by the explanatory variables. In this example, **73.4%** of the variation in exam scores can be explained by hours studied and number of prep exams taken.
- **Std. Error of the Estimate:** The standard error is the average distance that the observed values fall from the regression line. In this example, the observed values fall an average of **5.3657** units from the regression line.

The next table we’re interested in is titled **ANOVA**:

Here is how to interpret the most relevant numbers in this table:

- **F:** This is the overall F statistic for the regression model, calculated as Mean Square Regression / Mean Square Residual.
- **Sig:** This is the p-value associated with the overall F statistic. It tells us whether the regression model as a whole is statistically significant. In other words, it tells us if the two explanatory variables combined have a statistically significant association with the response variable. In this case the p-value is displayed as .000, which means it is less than .001 (SPSS rounds to three decimal places). This indicates that the explanatory variables hours studied and prep exams taken have a statistically significant association with exam score.
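The overall F statistic can also be recovered from R Square, because Mean Square Regression / Mean Square Residual is algebraically equal to (R² / k) / ((1 − R²) / (n − k − 1)), where k is the number of explanatory variables and n is the number of observations. The sketch below applies that identity to the values reported in this example (R² = 0.734, n = 20, k = 2). The function name `f_from_r_squared` is hypothetical, and because R² is rounded to three decimals the result is only approximate.

```python
def f_from_r_squared(r_squared, n_obs, n_predictors):
    """Overall F statistic implied by R-squared.

    Equivalent to Mean Square Regression / Mean Square Residual.
    """
    df_regression = n_predictors
    df_residual = n_obs - n_predictors - 1
    explained_per_df = r_squared / df_regression
    unexplained_per_df = (1.0 - r_squared) / df_residual
    return explained_per_df / unexplained_per_df


# Values taken from the Model Summary in this example: R Square = .734, 20 students, 2 predictors.
f_stat = f_from_r_squared(0.734, 20, 2)
print(round(f_stat, 2))
```

This comes out near 23.5. The F value in the ANOVA table is computed from unrounded sums of squares, so a small difference from this figure is expected.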

The next table we’re interested in is titled **Coefficients**:

Here is how to interpret the most relevant numbers in this table:

- **Unstandardized B (Constant):** This tells us the average value of the response variable when both predictor variables are zero. In this example, the average exam score is **67.674** when hours studied and prep exams taken are both equal to zero.
- **Unstandardized B (hours):** This tells us the average change in exam score associated with a one unit increase in hours studied, assuming number of prep exams taken is held constant. In this case, each additional hour spent studying is associated with an increase of **5.556** points in exam score, assuming the number of prep exams taken is held constant.
- **Unstandardized B (prep_exams):** This tells us the average change in exam score associated with a one unit increase in prep exams taken, assuming number of hours studied is held constant. In this case, each additional prep exam taken is associated with a decrease of **.602** points in exam score, assuming the number of hours studied is held constant.
- **Sig. (hours):** This is the p-value for the explanatory variable **hours**. Since this value (.000) is less than .05, we can conclude that hours studied has a statistically significant association with exam score.
- **Sig. (prep_exams):** This is the p-value for the explanatory variable **prep_exams**. Since this value (.519) is not less than .05, we cannot conclude that number of prep exams taken has a statistically significant association with exam score.

Lastly, we can form a regression equation using the values shown in the table for **constant**, **hours**, and **prep_exams**. In this case, the equation would be:

Estimated exam score = 67.674 + 5.556*(hours) – .602*(prep_exams)

We can use this equation to find the estimated exam score for a student, based on the number of hours they studied and the number of prep exams they took. For example, a student who studies for 3 hours and takes 2 prep exams is expected to receive an exam score of 83.1:

Estimated exam score = 67.674 + 5.556*(3) – .602*(2) = 83.1
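The same calculation can be wrapped in a small function so that other combinations of hours and prep exams are easy to try. This is a minimal sketch in Python using the coefficients from the Coefficients table; the constant names and the function name `predict_score` are hypothetical.

```python
# Coefficients from the SPSS Coefficients table in this example.
INTERCEPT = 67.674
B_HOURS = 5.556
B_PREP_EXAMS = -0.602


def predict_score(hours, prep_exams):
    """Estimated exam score from the fitted regression equation."""
    return INTERCEPT + B_HOURS * hours + B_PREP_EXAMS * prep_exams


# A student who studies for 3 hours and takes 2 prep exams.
print(round(predict_score(3, 2), 1))  # 83.1
```

Estimates are most trustworthy for values of hours and prep exams that fall within the range observed in the data used to fit the model.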

**Note:** Since the explanatory variable **prep exams** was not found to be statistically significant, we may decide to remove it from the model and instead perform simple linear regression using **hours studied** as the only explanatory variable.