Often you may want to check for outlier observations in a linear regression model.

This is important to do because outliers can affect the overall fit of the model and can cause problems when attempting to use the model to make predictions for the response values of unseen observations.

One common way to check for outliers in a regression model is to use the Bonferroni outlier test, which reports Bonferroni-adjusted p-values for the Studentized residuals of the observations in the dataset and gives us an idea of which observations could potentially be outliers.
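To make the adjustment concrete, here is a minimal sketch of a Bonferroni correction using the base R **p.adjust()** function. The three p-values are made up purely for illustration; each one is multiplied by the number of tests and capped at 1.

```r
# Hypothetical unadjusted p-values for three observations (illustration only)
p_unadjusted <- c(0.001, 0.02, 0.3)

# Bonferroni adjustment: multiply each p-value by the number of tests, cap at 1
p_bonferroni <- p.adjust(p_unadjusted, method = "bonferroni")

# Adjusted values: 0.003, 0.06, 0.9
print(p_bonferroni)
```

With a cutoff of .05, only the first of these hypothetical observations would be flagged after the adjustment, even though the second has an unadjusted p-value below .05.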

The easiest way to perform the Bonferroni outlier test in R is by using the **outlierTest()** function from the **car** package.

The **outlierTest()** function uses the following syntax:

**outlierTest(model, cutoff=.05, …)**

where:

**model:** A linear regression model fit using the lm() function

**cutoff:** Observations with Bonferroni p-values exceeding this value are not reported, unless no observations are nominated, in which case the one with the largest Studentized residual is reported

Note that you can adjust the cutoff value if you would like to change the requirement for what is considered to be an outlier. The default value is **.05**.
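As a rough sketch of changing the cutoff, the code below fits a model on the built-in mtcars dataset and runs the test with a more lenient cutoff of **.10**. It assumes the **car** package is installed. The component names **cutoff** and **signif** reflect our understanding of the object that **outlierTest()** returns and are worth confirming with **str()**.

```r
library(car)

# Fit a linear regression model on the built-in mtcars dataset
fit <- lm(mpg ~ disp + carb, data = mtcars)

# Use a more lenient cutoff of .10 instead of the default .05
res <- outlierTest(fit, cutoff = 0.10)
print(res)

# The returned object stores the cutoff and whether any observation met it
# (component names are an assumption here; confirm with str(res))
res$cutoff
res$signif
```

Storing the result in an object like this makes it possible to check programmatically whether any observation met the cutoff, rather than reading the printed output.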

The following example shows how to use the **outlierTest()** function in practice in R.

**Example: How to Perform a Bonferroni Outlier Test in R**

For this particular example we will fit a multiple linear regression model using the built-in mtcars dataset in R.

We can use the **head()** function to view the first few rows from this dataset:

**#view head of mtcars dataset
head(mtcars)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1**

The dataset contains various measurements for different cars.

Suppose that we would like to fit a multiple linear regression model using **disp** and **carb** as the predictor variables to predict the value of **mpg** (miles per gallon) of each car in the dataset.

We can use the following syntax to fit this regression model and view the model summary:

**#fit regression model
fit <- lm(mpg ~ disp + carb, data = mtcars)
#view model summary
summary(fit)
Call:
lm(formula = mpg ~ disp + carb, data = mtcars)
Residuals:
    Min      1Q  Median      3Q     Max
-4.3379 -2.0849 -0.3448  1.5118  6.2836
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 31.152710   1.263620  24.654  < 2e-16 ***
disp        -0.036296   0.004676  -7.762 1.47e-08 ***
carb        -0.955677   0.358789  -2.664   0.0125 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.964 on 29 degrees of freedom
Multiple R-squared: 0.7737, Adjusted R-squared: 0.7581
F-statistic: 49.58 on 2 and 29 DF, p-value: 4.393e-10**

Now suppose that we would like to perform a Bonferroni outlier test to check if any of the observations in the original dataset are considered to be outliers when used in the regression model.

We can use the following syntax with the **outlierTest()** function to do so:

**library(car)
#perform Bonferroni outlier test
outlierTest(fit)
No Studentized residuals with Bonferroni p < 0.05
Largest |rstudent|:
               rstudent unadjusted p-value Bonferroni p
Toyota Corolla 2.411735           0.022681      0.72579**

The output tells us that there are **No Studentized residuals with Bonferroni p < 0.05**.

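This tells us that no observation is flagged as a statistically significant outlier in this regression model at the .05 level.

Because no observation met the cutoff, the **outlierTest()** function instead reports the observation with the largest absolute Studentized residual. Here that is the Toyota Corolla, with a Studentized residual of 2.411735, an unadjusted p-value of 0.022681 and a Bonferroni p-value of 0.72579, which is the unadjusted p-value multiplied by the 32 observations in the dataset.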
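To see where the values in the output above come from, the following sketch reproduces the calculation using base R only. It assumes that the test applies a two-sided t-test to each Studentized residual, using one fewer degree of freedom than the residual degrees of freedom of the model, and then multiplies each p-value by the number of observations. This reflects our understanding of how the test works for models fit with lm() and is not the package's own code; the variable names are chosen only for illustration.

```r
# Fit the same regression model on the built-in mtcars dataset
fit <- lm(mpg ~ disp + carb, data = mtcars)

# Studentized (deleted) residuals for every observation
rstud <- rstudent(fit)
n <- length(rstud)

# Degrees of freedom: residual df of the model minus one (assumed convention)
df <- df.residual(fit) - 1

# Two-sided unadjusted p-value for each Studentized residual
p_unadj <- 2 * pt(abs(rstud), df = df, lower.tail = FALSE)

# Bonferroni adjustment: multiply by the number of observations, cap at 1
p_bonf <- pmin(1, n * p_unadj)

# Observation with the largest absolute Studentized residual
worst <- which.max(abs(rstud))
data.frame(rstudent = rstud[worst],
           unadjusted_p = p_unadj[worst],
           bonferroni_p = p_bonf[worst])
```

Under these assumptions, the row returned should be the Toyota Corolla with values that agree with the **outlierTest()** output, which is a useful check that the Bonferroni p-value is simply the unadjusted p-value scaled by the sample size.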
