# How to Interpret a Curved Residual Plot (With Example)

Residual plots are used to assess whether a regression model captures the form of the relationship in the data and whether or not the residuals exhibit heteroscedasticity (non-constant variance).

Ideally, you would like the points in a residual plot to be randomly scattered around a value of zero with no clear pattern.

If you encounter a residual plot where the points in the plot have a curved pattern, it likely means that the regression model you have specified for the data is not correct.

Often, it means that you attempted to fit a linear regression model to a dataset that instead follows a nonlinear trend, such as a quadratic one.

The following example shows how to interpret (and fix) a curved residual plot in practice.

## Example: Interpreting a Curved Residual Plot

Suppose we collect data on the number of hours worked per week and the reported happiness level (on a scale of 0-100) for 11 different people in some office. The values appear in the R code below.

If we create a simple scatter plot of hours worked vs. happiness level, happiness rises with hours worked up to about 35 hours per week and then falls.

Now suppose we would like to fit a regression model using hours worked to predict happiness level.
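The data and scatter plot described above can be reproduced with a short R sketch. The values are the ones used in the model-fitting code in this example; the axis labels and point style are illustrative choices, not taken from the original figure.

```r
#create dataframe (same values used in the regression code in this example)
df <- data.frame(hours=c(6, 9, 12, 14, 30, 35, 40, 47, 51, 55, 60),
                 happiness=c(14, 28, 50, 70, 89, 94, 90, 75, 59, 44, 27))

#view the data
print(df)

#create scatter plot of hours worked vs. happiness level
#(axis labels and point style are illustrative choices)
plot(df$hours, df$happiness, pch=16,
     xlab='Hours Worked per Week', ylab='Happiness Level')
```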

The following code shows how to fit a simple linear regression model to this dataset and produce a residual plot in R:

```
#create dataframe
df <- data.frame(hours=c(6, 9, 12, 14, 30, 35, 40, 47, 51, 55, 60),
                 happiness=c(14, 28, 50, 70, 89, 94, 90, 75, 59, 44, 27))

#fit linear regression model
linear_model <- lm(happiness ~ hours, data=df)

#get list of residuals
res <- resid(linear_model)

#produce residual vs. fitted plot
plot(fitted(linear_model), res, xlab='Fitted Values', ylab='Residuals')

#add a horizontal line at 0
abline(0,0)
```

The x-axis displays the fitted values and the y-axis displays the residuals.

From the plot we can see that there is a curved pattern in the residuals, which indicates that a linear regression model does not provide an appropriate fit to this dataset.
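The curved pattern can also be checked numerically. As a minimal sketch, the code below lists each residual from the linear model alongside hours worked. The table name `res_table` and its column names are labels chosen for this illustration. Because the fitted values in a simple linear regression are a linear function of hours, ordering the rows by hours corresponds to moving along the x-axis of the residual plot.

```r
#create dataframe
df <- data.frame(hours=c(6, 9, 12, 14, 30, 35, 40, 47, 51, 55, 60),
                 happiness=c(14, 28, 50, 70, 89, 94, 90, 75, 59, 44, 27))

#fit linear regression model
linear_model <- lm(happiness ~ hours, data=df)

#collect hours, fitted values, and residuals in one table
res_table <- data.frame(hours=df$hours,
                        fitted=round(fitted(linear_model), 1),
                        residual=round(resid(linear_model), 1))

#sort by hours so the rows follow the x-axis of the residual plot
res_table <- res_table[order(res_table$hours), ]
print(res_table)

#show only the sign of each residual (-1 = below the line, 1 = above)
print(sign(res_table$residual))
```

For this dataset the residuals are negative at the low and high ends of hours worked and positive in between. A run of same-signed residuals like this suggests the fitted line is systematically missing a bend in the data, which is what the curved residual plot shows.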

The following code shows how to fit a quadratic regression model to this dataset and produce a residual plot in R:

```
#create dataframe
df <- data.frame(hours=c(6, 9, 12, 14, 30, 35, 40, 47, 51, 55, 60),
                 happiness=c(14, 28, 50, 70, 89, 94, 90, 75, 59, 44, 27))

#define quadratic term to use in model
df$hours2 <- df$hours^2

#fit quadratic regression model
quadratic_model <- lm(happiness ~ hours + hours2, data=df)

#get list of residuals
res <- resid(quadratic_model)

#produce residual vs. fitted plot
plot(fitted(quadratic_model), res, xlab='Fitted Values', ylab='Residuals')

#add a horizontal line at 0
abline(0,0)
```

Once again the x-axis displays the fitted values and the y-axis displays the residuals.

From the plot we can see that the residuals are randomly scattered around zero and there is no clear pattern in the residuals.

This tells us that a quadratic regression model does a much better job of fitting this dataset compared to a linear regression model.

This should make sense considering the scatter plot showed that the relationship between hours worked and happiness level appeared to be quadratic rather than linear.
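The visual comparison of the two residual plots can be backed up with a few numbers. The sketch below refits both models, compares their adjusted R-squared values, and uses `anova()` to run an F-test on the added quadratic term. The object names `adj_r2` and `model_comparison` are chosen for this illustration, and this check is one reasonable option rather than the only way to compare the models.

```r
#create dataframe
df <- data.frame(hours=c(6, 9, 12, 14, 30, 35, 40, 47, 51, 55, 60),
                 happiness=c(14, 28, 50, 70, 89, 94, 90, 75, 59, 44, 27))

#define quadratic term to use in model
df$hours2 <- df$hours^2

#fit both models
linear_model <- lm(happiness ~ hours, data=df)
quadratic_model <- lm(happiness ~ hours + hours2, data=df)

#compare adjusted R-squared (penalizes the extra term in the quadratic model)
adj_r2 <- c(linear=summary(linear_model)$adj.r.squared,
            quadratic=summary(quadratic_model)$adj.r.squared)
print(adj_r2)

#F-test of whether adding the quadratic term reduces the residual error
model_comparison <- anova(linear_model, quadratic_model)
print(model_comparison)
```

A higher adjusted R-squared for the quadratic model, together with a small p-value in the `anova()` output, would agree with what the residual plots show.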