When performing regression analysis, we’re often interested in understanding how changes in an independent variable affect a dependent variable. However, sometimes a moderating variable can affect this relationship.
For example, suppose we want to fit a regression model in which we use the independent variable hours spent exercising each week to predict the dependent variable resting heart rate.
We suspect that more hours spent exercising is associated with a lower resting heart rate. However, this relationship could be affected by a moderating variable such as gender.
It’s possible that each extra hour of exercise is associated with a larger drop in resting heart rate for men than for women.
Another example of a moderating variable could be age. It’s also possible that each extra hour of exercise is associated with a larger drop in resting heart rate for younger people than for older people.
Properties of Moderating Variables
Moderating variables have the following properties:
1. Moderating variables can be qualitative or quantitative.
Qualitative variables are variables that take on names or labels. Examples include:
- Gender (Male or Female)
- Education Level (High School Degree, Bachelor’s Degree, Master’s Degree, etc.)
- Marital Status (Single, Married, Divorced)
Quantitative variables are variables that take on numerical values. Examples include:
- Square Footage
- Population Size
In the previous examples, gender was a qualitative variable that could affect the relationship between hours spent exercising and resting heart rate, while age was a quantitative variable that could potentially affect the same relationship.
2. Moderating variables can affect the relationship between an independent and dependent variable in a variety of ways.
Moderating variables can have the following effects:
- Strengthen the relationship between two variables.
- Weaken the relationship between two variables.
- Negate the relationship between two variables.
Depending on the situation, a moderating variable can moderate the relationship between two variables in many different ways.
How to Test for Moderating Variables
If X is an independent variable (sometimes called a “predictor” variable) and Y is a dependent variable (sometimes called a “response” variable), then we could write a regression equation to describe the relationship between the two variables as follows:
Y = β0 + β1X
If we suspect that some other variable, Z, is a moderator variable, then we could fit the following regression model:
Y = β0 + β1X + β2Z + β3XZ
In this equation, the term XZ is known as an interaction term.
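To see why the interaction term captures moderation, note that the model can be rearranged as Y = β0 + (β1 + β3Z)X + β2Z, so the slope of X is β1 + β3Z rather than a single fixed number. The following Python sketch illustrates this with made-up coefficient values (they are not estimates from any real data) and assumes Z is coded as 0 or 1. The function name slope_of_x is hypothetical.

```python
# Illustrative only: coefficients for Y = b0 + b1*X + b2*Z + b3*X*Z are assumed values.
def slope_of_x(b1, b3, z):
    """Change in Y per one-unit increase in X when the moderator Z is held at z."""
    return b1 + b3 * z

b1 = -2.0  # assumed slope of X when Z = 0

# Each assumed b3 shows one way Z can moderate the X-Y relationship.
for label, b3 in [("strengthens", -1.0), ("weakens", 1.0), ("negates", 2.0)]:
    at_z0 = slope_of_x(b1, b3, 0)
    at_z1 = slope_of_x(b1, b3, 1)
    print(f"Z {label}: slope at Z=0 is {at_z0}, slope at Z=1 is {at_z1}")

# Z strengthens: slope at Z=0 is -2.0, slope at Z=1 is -3.0
# Z weakens: slope at Z=0 is -2.0, slope at Z=1 is -1.0
# Z negates: slope at Z=0 is -2.0, slope at Z=1 is 0.0
```

When β3 is zero, the slope of X is β1 at every value of Z, which is why the test for moderation focuses on the coefficient of XZ.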
If the p-value for the coefficient of XZ in the regression output is less than the chosen significance level (commonly 0.05), then there is a statistically significant interaction between X and Z, and Z should be included in the regression model as a moderator variable.
We would write the final model as:
Y = β0 + β1X + β2Z + β3XZ
If the p-value for the coefficient of XZ in the regression output is not less than the significance level, then there is not sufficient evidence that Z is a moderator variable.
However, it’s possible that the coefficient for Z is still statistically significant. In this case, we would simply include Z as another independent variable in the regression model.
We would then write the final model as:
Y = β0 + β1X + β2Z
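The procedure above can be sketched in Python using only the standard library. This is a minimal illustration, not a definitive implementation: the data are simulated (the heart rate values and the true coefficients are invented), the helper names solve and ols are hypothetical, the significance level of 0.05 is an assumption, and the p-values use a normal approximation to the t distribution, which is reasonable only for large samples. In practice a statistics package would normally be used instead.

```python
import random
from statistics import NormalDist

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(rows, y):
    """Least-squares coefficients and approximate two-sided p-values."""
    n, k = len(rows), len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    # Diagonal entries of the inverse of X'X, obtained one unit vector at a time.
    inv_diag = [solve(xtx, [1.0 if i == j else 0.0 for i in range(k)])[j]
                for j in range(k)]
    pvals = []
    for b, d in zip(beta, inv_diag):
        stat = abs(b) / (sigma2 * d) ** 0.5
        # Normal approximation to the t distribution (assumes a large sample).
        pvals.append(2 * (1 - NormalDist().cdf(stat)))
    return beta, pvals

# Simulated data: X = weekly exercise hours, Z = moderator coded 0/1,
# Y = resting heart rate generated with an invented interaction effect of -1.5.
random.seed(0)
n = 500
x = [random.uniform(0, 10) for _ in range(n)]
z = [random.choice([0, 1]) for _ in range(n)]
y = [80 - 1.0 * xi - 2.0 * zi - 1.5 * xi * zi + random.gauss(0, 3)
     for xi, zi in zip(x, z)]

alpha = 0.05  # assumed significance level

# Step 1: fit Y = b0 + b1*X + b2*Z + b3*X*Z and check the XZ coefficient.
beta, pvals = ols([[1.0, xi, zi, xi * zi] for xi, zi in zip(x, z)], y)
if pvals[3] < alpha:
    decision = "keep X, Z, and XZ: Z moderates the relationship"
else:
    # Step 2: drop XZ and check whether Z is still useful as a predictor.
    _, p_main = ols([[1.0, xi, zi] for xi, zi in zip(x, z)], y)
    decision = "keep X and Z" if p_main[2] < alpha else "keep X only"

print("Estimated coefficients:", [round(b, 2) for b in beta])
print("Decision:", decision)
```

Because the simulated data were generated with a sizable interaction effect, the first branch is taken here. Setting the interaction coefficient in the simulation to 0 would be one way to explore the second branch.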