# How to Interpret Log-Likelihood Values (With Examples)

The log-likelihood value of a regression model is a way to measure the goodness of fit for a model. The higher the value of the log-likelihood, the better a model fits a dataset.

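The log-likelihood value for a given model can range from negative infinity to positive infinity. Positive values are possible because, for a continuous response variable, the likelihood is built from density values, which can exceed 1. The actual log-likelihood value for a given model is mostly meaningless on its own, but it’s useful for comparing two or more models fit to the same dataset.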
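As a quick illustration of why the raw value says little on its own, here is a minimal sketch using simulated data. The variables `x`, `y_small`, and `y_large` are made up for this example and are not part of the housing dataset used later. The same response is recorded on two scales: the quality of the fit is identical, but the log-likelihood changes sign.

```r
#log-density of a single point at the mean of a normal distribution with sd = 0.1
#the density there is about 3.99 (greater than 1), so its log is positive (about 1.38)
ll_one <- dnorm(0, mean=0, sd=0.1, log=TRUE)
ll_one

#simulated data (hypothetical, for illustration only)
set.seed(1)
x <- 1:20
y_small <- 0.01 * x + rnorm(20, sd=0.01)
y_large <- 1000 * y_small   #same response, different units

#same fit quality, very different log-likelihood values
fit_small <- lm(y_small~x)
fit_large <- lm(y_large~x)
logLik(fit_small)   #positive
logLik(fit_large)   #negative
```

The two values differ by exactly n × log(1000), which reflects only the change of units. This is why log-likelihood values should be compared only across models fit to the same response variable.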

In practice, we often fit several regression models with the same number of predictor variables to a dataset and choose the model with the highest log-likelihood value as the model that fits the data best.

The following example shows how to interpret log-likelihood values for different regression models in practice.

## Example: Interpreting Log-Likelihood Values

Suppose we have a dataset that shows the number of bedrooms, number of bathrooms, and selling price of 20 different houses in a particular neighborhood. The values are defined in the R code below.

Suppose we’d like to fit the following two regression models and determine which one offers a better fit to the data:

Model 1: Price = β0 + β1(number of bedrooms)

Model 2: Price = β0 + β1(number of bathrooms)

The following code shows how to fit each regression model and calculate the log-likelihood value of each model in R:

```r
#define data
df <- data.frame(beds=c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3,
                        3, 3, 3, 3, 4, 4, 4, 5, 5, 6),
                 baths=c(2, 1, 4, 3, 2, 2, 3, 5, 4, 3,
                         4, 4, 3, 4, 2, 4, 3, 5, 6, 7),
                 price=c(120, 133, 139, 185, 148, 160, 192, 205, 244, 213,
                         236, 280, 275, 273, 312, 311, 304, 415, 396, 488))

#fit models
model1 <- lm(price~beds, data=df)
model2 <- lm(price~baths, data=df)

#calculate log-likelihood value of each model
logLik(model1)
#'log Lik.' -91.04219 (df=3)

logLik(model2)
#'log Lik.' -111.7511 (df=3)
```

The first model has a higher log-likelihood value (-91.04) than the second model (-111.75), which means the first model offers a better fit to the data.
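To see where the value of -91.04 comes from, the sketch below recomputes the log-likelihood of the first model by hand. It assumes normally distributed errors and uses the maximum-likelihood estimate of the error variance (the residual sum of squares divided by n), which is what `logLik()` uses for an `lm` fit. The names `n`, `rss`, `sigma2_hat`, and `ll_manual` are introduced here for illustration.

```r
#define data (bedrooms and price only)
df <- data.frame(beds=c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3,
                        3, 3, 3, 3, 4, 4, 4, 5, 5, 6),
                 price=c(120, 133, 139, 185, 148, 160, 192, 205, 244, 213,
                         236, 280, 275, 273, 312, 311, 304, 415, 396, 488))

#fit model
model1 <- lm(price~beds, data=df)

#maximum-likelihood estimate of the error variance (divide by n, not n - 2)
n <- nrow(df)
rss <- sum(residuals(model1)^2)
sigma2_hat <- rss / n

#normal log-likelihood evaluated at the fitted coefficients and sigma2_hat
ll_manual <- -n / 2 * (log(2 * pi) + log(sigma2_hat) + 1)
ll_manual

#check against the built-in function
all.equal(ll_manual, as.numeric(logLik(model1)))
```

The `df=3` in the `logLik()` output counts the estimated parameters: the intercept, the slope, and the error variance.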

## Cautions on Using Log-Likelihood Values

When calculating log-likelihood values, it’s important to note that adding more predictor variables to a model will almost always increase the log-likelihood value even if the additional predictor variables aren’t statistically significant.
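The following sketch illustrates this point. It adds a column of pure random noise (a hypothetical predictor named `noise`, generated only for this example) to the bedrooms model and compares the log-likelihood values before and after.

```r
#define data (bedrooms and price only)
df <- data.frame(beds=c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3,
                        3, 3, 3, 3, 4, 4, 4, 5, 5, 6),
                 price=c(120, 133, 139, 185, 148, 160, 192, 205, 244, 213,
                         236, 280, 275, 273, 312, 311, 304, 415, 396, 488))

#add a predictor that is unrelated to price by construction
set.seed(42)
df$noise <- rnorm(nrow(df))

#fit the original model and the model with the extra predictor
model1 <- lm(price~beds, data=df)
model1_noise <- lm(price~beds+noise, data=df)

#the larger model cannot have a lower log-likelihood, because model1 is nested within it
logLik(model1)
logLik(model1_noise)

#inspect the estimate and p-value for the noise predictor
summary(model1_noise)$coefficients["noise", ]
```

Because the noise column has no real relationship with price, any gain in log-likelihood here reflects overfitting rather than a genuinely better model.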

This means you should only compare the log-likelihood values between two regression models if each model has the same number of predictor variables.

To compare models with different numbers of predictor variables, you can perform a likelihood-ratio test, provided the models are nested (the predictor variables of one model are a subset of the predictor variables of the other).
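As a minimal sketch, a likelihood-ratio test can be computed by hand in base R. The example below compares the bedrooms model against a larger model that also includes bathrooms. The names `reduced`, `full`, `lr_stat`, `df_diff`, and `p_value` are chosen for illustration, and the chi-square reference distribution is a large-sample approximation, so with only 20 observations the p-value should be read as approximate.

```r
#define data
df <- data.frame(beds=c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3,
                        3, 3, 3, 3, 4, 4, 4, 5, 5, 6),
                 baths=c(2, 1, 4, 3, 2, 2, 3, 5, 4, 3,
                         4, 4, 3, 4, 2, 4, 3, 5, 6, 7),
                 price=c(120, 133, 139, 185, 148, 160, 192, 205, 244, 213,
                         236, 280, 275, 273, 312, 311, 304, 415, 396, 488))

#fit nested models: the reduced model's predictors are a subset of the full model's
reduced <- lm(price~beds, data=df)
full <- lm(price~beds+baths, data=df)

#test statistic: twice the difference in log-likelihood values
lr_stat <- 2 * (as.numeric(logLik(full)) - as.numeric(logLik(reduced)))

#degrees of freedom: difference in the number of estimated parameters
df_diff <- attr(logLik(full), "df") - attr(logLik(reduced), "df")

#p-value from the chi-square distribution
p_value <- pchisq(lr_stat, df=df_diff, lower.tail=FALSE)

lr_stat
df_diff
p_value
```

A small p-value suggests the extra predictor improves the fit by more than chance alone would explain. If the `lmtest` package is installed, its `lrtest()` function performs the same kind of comparison.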


## 4 Replies to “How to Interpret Log-Likelihood Values (With Examples)”

1. Santo Samuel Surja says:

Any help would be greatly appreciated. Thank you in advance.

2. Obama says:

Or you can use Bayesian information criterion, which is the same thing you described but it takes into account the number of predictor variables (parameters)

BIC = -2 * LL + log(N) * k
Where log() has the base-e called the natural logarithm, LL is the log-likelihood of the model, N is the number of examples in the training dataset, and k is the number of parameters in the model.
The score as defined above is minimized, e.g. the model with the lowest BIC is selected.

3. Imran Thobani says:

Because the likelihood of any given data point is at most 1, the log-likelihood of any given data point is at most 0, so the log-likelihood can only range from negative infinity to 0.

4. the less smart zach says:

sick thanks zach.