One of the key assumptions in linear regression is that there is no correlation between the residuals, i.e. the residuals are independent.

To test for first-order autocorrelation, we can perform a Durbin-Watson test. However, if we’d like to test for autocorrelation at higher orders then we need to perform a **Breusch-Godfrey test**.

This test uses the following hypotheses:

**H<sub>0</sub> (null hypothesis):** There is no autocorrelation at any order less than or equal to *p*.

**H<sub>A</sub> (alternative hypothesis):** There exists autocorrelation at some order less than or equal to *p*.

Under the null hypothesis, the test statistic approximately follows a Chi-Square distribution with *p* degrees of freedom.

If the p-value that corresponds to this test statistic is less than a certain significance level (e.g. 0.05) then we can reject the null hypothesis and conclude that autocorrelation exists among the residuals at some order less than or equal to *p*.

To perform a Breusch-Godfrey test in R, we can use the **bgtest(y ~ x, order = p)** function from the **lmtest** package.

This tutorial provides an example of how to use this syntax in R.

**Example: Breusch-Godfrey Test in R**

First, let’s create a fake dataset that contains two predictor variables (x1 and x2) and one response variable (y).

```r
#create dataset
df <- data.frame(x1=c(3, 4, 4, 5, 8, 9, 11, 13, 14, 16, 17, 20),
                 x2=c(7, 7, 8, 8, 12, 4, 5, 15, 9, 17, 19, 19),
                 y=c(24, 25, 25, 27, 29, 31, 34, 34, 39, 30, 40, 49))

#view first six rows of dataset
head(df)

  x1 x2  y
1  3  7 24
2  4  7 25
3  4  8 25
4  5  8 27
5  8 12 29
6  9  4 31
```

Next, we can perform a Breusch-Godfrey test using the **bgtest()** function from the **lmtest** package.

For this example, we’ll test for autocorrelation among the residuals at order *p* = 3:

```r
#load lmtest package
library(lmtest)

#perform Breusch-Godfrey test
bgtest(y ~ x1 + x2, order=3, data=df)

	Breusch-Godfrey test for serial correlation of order up to 3

data:  y ~ x1 + x2
LM test = 8.7031, df = 3, p-value = 0.03351
```

From the output we can see that the test statistic is X<sup>2</sup> = **8.7031** with 3 degrees of freedom. The corresponding p-value is **0.03351**.
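As a quick check, the p-value can be recovered from the reported statistic and degrees of freedom with base R's `pchisq()`. This is only a minimal sketch; the variable names `lm_stat`, `p_order`, and `p_value` are illustrative and not part of the **lmtest** output.

```r
# Values copied from the bgtest() output above
lm_stat <- 8.7031   # reported LM test statistic
p_order <- 3        # order tested, which is also the degrees of freedom

# Upper-tail probability of a Chi-Square distribution with p_order df
p_value <- pchisq(lm_stat, df = p_order, lower.tail = FALSE)

# Should agree with the reported p-value up to rounding
round(p_value, 5)
```

The upper tail is used because larger values of the statistic count as stronger evidence against the null hypothesis.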

Since this p-value is less than 0.05, we can reject the null hypothesis and conclude that autocorrelation exists among the residuals at some order less than or equal to 3.
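To see where the statistic comes from, the test can be sketched by hand with an auxiliary regression: regress the residuals of the original model on the original predictors plus the first *p* lagged residuals, then multiply the resulting R-squared by the sample size. The sketch below uses base R only. It assumes that missing lagged residuals are padded with zeros and that the statistic takes the n × R-squared form; these are assumptions about the convention, so small differences from the **bgtest()** output are possible. All object names (`fit`, `res`, `lagged`, `aux`, `lm_stat`, `p_value`) are illustrative.

```r
# Same example data as above
df <- data.frame(x1=c(3, 4, 4, 5, 8, 9, 11, 13, 14, 16, 17, 20),
                 x2=c(7, 7, 8, 8, 12, 4, 5, 15, 9, 17, 19, 19),
                 y=c(24, 25, 25, 27, 29, 31, 34, 34, 39, 30, 40, 49))

# Step 1: fit the original model and extract its residuals
fit <- lm(y ~ x1 + x2, data = df)
res <- unname(resid(fit))
n <- length(res)
p <- 3

# Step 2: build a matrix of lagged residuals (lags 1..p),
# padding the unavailable leading values with zeros (assumed convention)
lagged <- sapply(1:p, function(k) c(rep(0, k), res[1:(n - k)]))
colnames(lagged) <- paste0("lag", 1:p)

# Step 3: auxiliary regression of the residuals on the
# original predictors and the lagged residuals
aux <- lm(res ~ x1 + x2 + lagged, data = df)

# Step 4: LM statistic = n * R-squared, compared against Chi-Square(p)
lm_stat <- n * summary(aux)$r.squared
p_value <- pchisq(lm_stat, df = p, lower.tail = FALSE)

c(LM = lm_stat, df = p, p.value = p_value)
```

In practice the packaged function is the safer choice; the manual version is mainly useful for understanding what the test measures.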

**How to Handle Autocorrelation**

If you reject the null hypothesis and conclude that autocorrelation is present in the residuals, then you have a few different options to correct this problem if you deem it to be serious enough:

- For positive serial correlation, consider adding lags of the dependent and/or independent variable to the model.
- For negative serial correlation, check to make sure that none of your variables are *overdifferenced*.
- For seasonal correlation, consider adding seasonal dummy variables to the model.
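As a rough illustration of the first option, the sketch below adds a one-period lag of the response variable to the example model using base R. The column name `y_lag1` and the object name `fit_lag` are hypothetical, and whether a single lag is adequate depends on the data at hand.

```r
# Same example data as above
df <- data.frame(x1=c(3, 4, 4, 5, 8, 9, 11, 13, 14, 16, 17, 20),
                 x2=c(7, 7, 8, 8, 12, 4, 5, 15, 9, 17, 19, 19),
                 y=c(24, 25, 25, 27, 29, 31, 34, 34, 39, 30, 40, 49))

# Create a one-period lag of the response; the first row has no
# previous value, so it is set to NA
df$y_lag1 <- c(NA, head(df$y, -1))

# Refit the model with the lagged response as an extra predictor;
# lm() drops the row containing NA by default
fit_lag <- lm(y ~ x1 + x2 + y_lag1, data = df)

# Number of observations actually used in the fit
nobs(fit_lag)
```

The refitted model can then be put through the same Breusch-Godfrey test to check whether the autocorrelation has been reduced. Note that one observation is lost for each lag added, which matters in a dataset this small.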

**Additional Resources**

How to Perform Simple Linear Regression in R

How to Perform Multiple Linear Regression in R

How to Perform a Durbin-Watson Test in R