How to Perform Quadratic Regression in Stata


When two variables have a linear relationship, you can often use simple linear regression to quantify their relationship.

Example of linear relationship

However, when two variables have a quadratic relationship, you can instead use quadratic regression to quantify their relationship.

Example of quadratic relationship

This tutorial explains how to perform quadratic regression in Stata.

Example: Quadratic Regression in Stata

Suppose we are interested in understanding the relationship between number of hours worked and happiness. We have the following data on the number of hours worked per week and the reported happiness level (on a scale of 0-100) for 16 different people:

Quadratic regression dataset in Stata

You can replicate this example by entering this exact data into Stata using Data > Data Editor > Data Editor (Edit) from the top menu.

Use the following steps to perform a quadratic regression in Stata.

Step 1: Visualize the data.

Before we can use quadratic regression, we need to make sure that the relationship between the explanatory variable (hours) and response variable (happiness) is actually quadratic. So, let’s visualize the data using a scatterplot by typing the following into the Command box:

scatter happiness hours

This produces the following scatterplot:

Quadratic scatterplot in Stata

We can see that happiness tends to increase as the number of hours worked increases from zero up to a certain point, then begins to decline once the number of hours worked exceeds about 30.

This upside-down “U” shape in the scatterplot suggests that there is a quadratic relationship between hours worked and happiness, which means we should use quadratic regression to quantify this relationship.
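One way to check the visual impression is to overlay a fitted quadratic curve on the same scatterplot. The sketch below assumes the variables are named happiness and hours, as in this example, and uses the built-in qfit plot type of twoway:

```stata
* Scatterplot of the raw data with a quadratic fit line drawn on top
twoway (scatter happiness hours) (qfit happiness hours)
```

If the curve follows the points closely and bends the way the points do, a quadratic model is a reasonable choice.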

Step 2: Perform quadratic regression.

Before we fit the quadratic regression model to the data, we need to create a new variable for the squared values of our predictor variable hours. We can do so by typing the following into the Command box:

gen hours2 = hours*hours

We can view this new variable by going to Data > Data Editor > Data Editor (Browse) along the top menu.

Quadratic regression in Stata
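The same check can be done from the Command box instead of the Data Editor. This is a minimal sketch that assumes the variable names used above; the in 1/5 range is only an illustrative choice to keep the listing short:

```stata
* Show the first five observations of both variables side by side
list hours hours2 in 1/5

* Stop with an error if any hours2 value is not the square of hours
* (float() matches the default storage type that gen uses)
assert hours2 == float(hours*hours)
```

If assert returns without printing anything, every value of hours2 passed the check.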

We can see that hours2 is simply hours squared. Now we can perform quadratic regression using hours and hours2 as our explanatory variables and happiness as our response variable. To perform quadratic regression, type the following into the Command box:

regress happiness hours hours2

Quadratic regression output in Stata
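As an aside, Stata's factor-variable notation can fit the same model without creating hours2 by hand. The sketch below is an alternative, not the method this tutorial uses. The coefficients should be identical, but the squared term is labeled c.hours#c.hours in the output instead of hours2:

```stata
* c. marks hours as continuous; ## includes hours and hours x hours
regress happiness c.hours##c.hours

* Predicted happiness at 30 and 60 hours per week;
* margins accounts for the squared term automatically
margins, at(hours=(30 60))
```

The rest of this tutorial continues with the hours2 version of the model.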

Here is how to interpret the most interesting numbers in the output:

Prob > F: 0.000. This is the p-value for the overall regression. Since this value is less than 0.05, it means that the predictor variables hours and hours2 combined have a statistically significant relationship with the response variable happiness.

R-squared: 0.9092. This is the proportion of the variance in the response variable that can be explained by the explanatory variables. In this example, 90.92% of the variation in happiness can be explained by hours and hours2.

Regression Equation: We can form a regression equation using the coefficient values reported in the output table. In this case, the equation would be:

predicted happiness  = -30.25287 + 7.173061(hours) – .1069887(hours2)

We can use this equation to find the predicted happiness of an individual, given the number of hours they work per week.

For example, an individual that works 60 hours per week is predicted to have a happiness level of 14.97:

predicted happiness = -30.25287 + 7.173061(60) – .1069887(60^2) = 14.97.

Conversely, an individual that works 30 hours per week is predicted to have a happiness level of 88.65:

predicted happiness = -30.25287 + 7.173061(30) – .1069887(30^2) = 88.65.
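These calculations can also be done inside Stata, which avoids retyping rounded coefficients. The sketch below assumes the model was fit with hours and hours2 as above, and reads the estimates from _b[]. It also computes the number of hours at which predicted happiness peaks, using the vertex formula -b1 / (2*b2) for a quadratic. The name peak is just an illustrative label.

```stata
* Re-fit the model so the coefficients are available in _b[]
regress happiness hours hours2

* Predicted happiness at 60 and at 30 hours per week
display _b[_cons] + _b[hours]*60 + _b[hours2]*60^2
display _b[_cons] + _b[hours]*30 + _b[hours2]*30^2

* Hours at which predicted happiness is highest: -b1 / (2*b2)
display -_b[hours] / (2*_b[hours2])

* Same quantity with a delta-method standard error and confidence interval
nlcom (peak: -_b[hours] / (2*_b[hours2]))
```

With the coefficients reported above, the peak works out to roughly 33.5 hours per week, which agrees with the turning point visible in the scatterplot.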

Step 3: Report the results.

Lastly, we want to report the results of our quadratic regression. Here is an example of how to do so:

A quadratic regression was performed to quantify the relationship between the number of hours worked by an individual and their corresponding happiness level (measured from 0 to 100). A sample of 16 individuals was used in the analysis.

Results showed that there was a statistically significant relationship between the explanatory variables hours and hours2 and the response variable happiness (F(2, 13) = 65.09, p < 0.0001).

Combined, these two explanatory variables explained 90.92% of the variability in happiness.

The regression equation was found to be:

predicted happiness = -30.25287 + 7.173061(hours) – .1069887(hours2)
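The figures quoted in a report like this one can be read from the results that regress stores, rather than copied from the output table by hand. The following is a minimal sketch, assuming the same model as above. The display formats are only a suggestion.

```stata
* Stored results are available immediately after regress
regress happiness hours hours2

* F statistic with its numerator and denominator degrees of freedom
display "F(" e(df_m) ", " e(df_r) ") = " %6.2f e(F)

* Exact p-value for the overall F test (upper tail of the F distribution)
display "p = " Ftail(e(df_m), e(df_r), e(F))

* R-squared and sample size
display "R-squared = " %6.4f e(r2)
display "N = " e(N)
```

This is useful because the output table shows very small p-values rounded to zero, whereas Ftail() returns the unrounded value.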
