A **likelihood ratio test** compares the goodness of fit of two nested regression models.

A *nested* model is simply one that contains a subset of the predictor variables in the overall regression model.

For example, suppose we have the following regression model with four predictor variables:

Y = β_{0} + β_{1}x_{1} + β_{2}x_{2} + β_{3}x_{3} + β_{4}x_{4} + ε

One example of a nested model would be the following model with only two of the original predictor variables:

Y = β_{0} + β_{1}x_{1} + β_{2}x_{2} + ε

To determine if these two models are significantly different, we can perform a likelihood ratio test which uses the following null and alternative hypotheses:

**H_{0}:** The full model and the nested model fit the data equally well. Thus, you should **use the nested model**.

**H_{A}:** The full model fits the data significantly better than the nested model. Thus, you should **use the full model**.

If the p-value of the test is below a certain significance level (e.g. 0.05), then we can reject the null hypothesis and conclude that the full model offers a significantly better fit.
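Concretely, the test statistic is twice the gap between the two models' maximized log-likelihoods:

χ^{2} = -2[ln(L_{nested}) - ln(L_{full})]

Under the null hypothesis, this statistic approximately follows a Chi-Squared distribution with degrees of freedom equal to the number of predictor variables dropped from the full model.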

The following example shows how to perform a likelihood ratio test in R.

**Example: Likelihood Ratio Test in R**

The code below shows how to fit the following two regression models in R using data from the built-in **mtcars** dataset:

**Full model:** mpg = β_{0} + β_{1}disp + β_{2}carb + β_{3}hp + β_{4}cyl

**Reduced model:** mpg = β_{0} + β_{1}disp + β_{2}carb

We will use the **lrtest()** function from the **lmtest** package to perform a likelihood ratio test on these two models:

```r
library(lmtest)

#fit full model
model_full <- lm(mpg ~ disp + carb + hp + cyl, data = mtcars)

#fit reduced model
model_reduced <- lm(mpg ~ disp + carb, data = mtcars)

#perform likelihood ratio test for differences in models
lrtest(model_full, model_reduced)
```

```
Likelihood ratio test

Model 1: mpg ~ disp + carb + hp + cyl
Model 2: mpg ~ disp + carb
  #Df  LogLik Df  Chisq Pr(>Chisq)
1   6 -77.558                     
2   4 -78.603 -2 2.0902     0.3517
```

From the output we can see that the Chi-Squared test statistic is **2.0902** and the corresponding p-value is **0.3517**.

Since this p-value is not less than .05, we will fail to reject the null hypothesis.

This means the full model and the nested model fit the data equally well. Thus, we should use the nested model because the additional predictor variables in the full model don’t offer a significant improvement in fit.
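As a sanity check, we can reproduce the test statistic and p-value by hand from the two models' log-likelihoods. This is a quick sketch using base R's **logLik()** and **pchisq()** functions; it should match the **lrtest()** output above:

```r
#fit the same two models as above
model_full <- lm(mpg ~ disp + carb + hp + cyl, data = mtcars)
model_reduced <- lm(mpg ~ disp + carb, data = mtcars)

#test statistic: twice the difference in maximized log-likelihoods
chisq_stat <- 2 * (as.numeric(logLik(model_full)) - as.numeric(logLik(model_reduced)))

#degrees of freedom = number of predictors dropped from the full model (hp and cyl)
p_value <- pchisq(chisq_stat, df = 2, lower.tail = FALSE)

chisq_stat #2.0902
p_value    #0.3517
```

The p-value comes from the upper tail of the Chi-Squared distribution because larger gaps in log-likelihood provide stronger evidence against the nested model.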

We could then carry out another likelihood ratio test to determine if a model with only one predictor variable is significantly different from the model with two predictors:

```r
library(lmtest)

#fit full model
model_full <- lm(mpg ~ disp + carb, data = mtcars)

#fit reduced model
model_reduced <- lm(mpg ~ disp, data = mtcars)

#perform likelihood ratio test for differences in models
lrtest(model_full, model_reduced)
```

```
Likelihood ratio test

Model 1: mpg ~ disp + carb
Model 2: mpg ~ disp
  #Df  LogLik Df  Chisq Pr(>Chisq)   
1   4 -78.603                        
2   3 -82.105 -1 7.0034   0.008136 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

From the output we can see that the p-value of the likelihood ratio test is **0.008136**. Since this is less than .05, we would reject the null hypothesis.

Thus, we would conclude that the model with two predictors offers a significant improvement in fit over the model with just one predictor.

Our final model would therefore be:

mpg = β_{0} + β_{1}disp + β_{2}carb

**Additional Resources**

How to Perform Simple Linear Regression in R

How to Perform Multiple Linear Regression in R

How to Interpret Significance Codes in R