Curvilinear regression is the name given to any regression model that attempts to fit a curve as opposed to a straight line.
Common examples of curvilinear regression models include:
Quadratic Regression: Used when a quadratic relationship exists between a predictor variable and a response variable. When graphed, this type of relationship looks like a “U” or an upside-down “U” on a scatterplot:
Cubic Regression: Used when a cubic relationship exists between a predictor variable and a response variable. When graphed, this type of relationship has two distinct curves on a scatterplot:
These are both in contrast to simple linear regression in which the relationship between the predictor variable and the response variable is linear:
The Formula of Curvilinear Regression Models
A simple linear regression model attempts to fit a dataset using the following formula:
ŷ = β0 + β1x
- ŷ: The predicted value of the response variable
- β0, β1: The regression coefficients
- x: The predictor variable
In contrast, a quadratic regression model uses the following formula:
ŷ = β0 + β1x + β2x²
And a cubic regression model uses the following formula:
ŷ = β0 + β1x + β2x² + β3x³
A more general name given to regression models that include exponents is polynomial regression, which takes on the following formula:
ŷ = β0 + β1x + β2x² + … + βkxᵏ
The value of k indicates the degree of the polynomial. Although the degree can be any positive integer, in practice we rarely fit polynomial regression models with a degree higher than 3 or 4.
By using exponents in the regression model formula, polynomial regression models are able to fit curves to datasets instead of straight lines.
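As a sketch of how a polynomial model is fit in practice, the following example uses NumPy's `polyfit` to estimate the coefficients of a quadratic regression model on synthetic data (the data-generating coefficients here are made up for illustration):

```python
import numpy as np

# Synthetic data following an approximately quadratic pattern
# (true coefficients b0=2, b1=1.5, b2=0.8 are assumed for this example)
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 30)
y = 2 + 1.5 * x + 0.8 * x**2 + rng.normal(scale=0.5, size=x.size)

# Fit a degree-2 polynomial; polyfit returns coefficients
# from highest degree to lowest: [b2, b1, b0]
b2, b1, b0 = np.polyfit(x, y, deg=2)

print(f"Fitted model: ŷ = {b0:.2f} + {b1:.2f}x + {b2:.2f}x²")

# Predicted values from the fitted curve
y_hat = np.polyval([b2, b1, b0], x)
```

Because the noise is small relative to the curvature, the estimated coefficients land close to the values used to generate the data.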
When to Use Curvilinear Regression
The easiest way to know whether you should use curvilinear regression is to create a scatterplot of the predictor variable and the response variable.
If the scatterplot displays a linear relationship between the two variables, then simple linear regression is likely appropriate to use.
However, if the scatterplot shows a quadratic, cubic, or some other curvilinear pattern between the predictor and response variable, then curvilinear regression is likely more appropriate to use.
You can also fit a simple linear regression model and a curvilinear regression model and compare the adjusted R-squared values of each model to determine which model offers a better fit to the data.
The adjusted R-squared is useful because it tells you the proportion of the variance in the response variable that can be explained by the predictor variable(s), adjusted for the number of predictor variables in the model.
In general, the model with the higher adjusted R-squared value offers a better fit to the dataset.
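The comparison described above can be sketched as follows: fit a linear and a quadratic model to the same data, compute each model's adjusted R-squared by hand, and keep the model with the higher value. The data and the helper function here are illustrative, not from the original article:

```python
import numpy as np

def adjusted_r2(y, y_hat, k):
    """Adjusted R-squared for a model with k predictor terms."""
    n = len(y)
    ss_res = np.sum((y - y_hat) ** 2)          # residual sum of squares
    ss_tot = np.sum((y - np.mean(y)) ** 2)     # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Synthetic data with a clear quadratic pattern (assumed example data)
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 40)
y = 5 - 2 * x + 0.4 * x**2 + rng.normal(scale=1.0, size=x.size)

# Fit a linear (degree 1) and a quadratic (degree 2) model and compare
results = {}
for degree in (1, 2):
    coefs = np.polyfit(x, y, deg=degree)
    y_hat = np.polyval(coefs, x)
    results[degree] = adjusted_r2(y, y_hat, k=degree)
    print(f"degree {degree}: adjusted R² = {results[degree]:.3f}")
```

Since the data were generated from a quadratic relationship, the quadratic model's adjusted R-squared comes out higher, and the adjustment for the extra term guards against the higher-degree model winning on noise alone.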
The following tutorials explain how to perform polynomial regression in different statistical software: