How to Extract Fitted Values in R


Linear regression is a method that we can use in statistics to summarize the relationship between one or more predictor variables and a response variable.

When using a linear regression model, we’re often interested in extracting the fitted values of the model, which are the values that the model predicts for the response variable for each observation in a dataset.

To fit a linear regression model in R, we can use the lm() function.

Then, to extract the fitted values of the linear regression model, we can use the fitted.values component of the fitted model object.

The following example shows how to fit a linear regression model in R and then extract its fitted values in practice.

Example: How to Extract Fitted Values from Regression Model in R

Suppose we have the following data frame in R that contains information about various basketball players:

#create data frame
df <- data.frame(minutes=c(5, 10, 13, 14, 20, 22, 26, 34, 38, 40),
                 fouls=c(5, 5, 3, 4, 2, 1, 3, 2, 1, 1),
                 points=c(6, 8, 8, 7, 14, 10, 22, 24, 28, 30))

#view data frame
df

   minutes fouls points
1        5     5      6
2       10     5      8
3       13     3      8
4       14     4      7
5       20     2     14
6       22     1     10
7       26     3     22
8       34     2     24
9       38     1     28
10      40     1     30

Suppose we would like to fit the following multiple linear regression model using minutes played and total fouls to predict the number of points scored by each player:

points = β0 + β1(minutes) + β2(fouls)

We can use the lm() function to fit this model:

#fit multiple linear regression model
fit <- lm(points ~ minutes + fouls, data=df)

#view summary of model
summary(fit)

Call:
lm(formula = points ~ minutes + fouls, data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.5241 -1.4782  0.5918  1.6073  2.0889 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -11.8949     4.5375  -2.621   0.0343 *  
minutes       0.9774     0.1086   9.000 4.26e-05 ***
fouls         2.1838     0.8398   2.600   0.0354 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.148 on 7 degrees of freedom
Multiple R-squared:  0.959,	Adjusted R-squared:  0.9473 
F-statistic: 81.93 on 2 and 7 DF,  p-value: 1.392e-05

We can see that the p-values (in the Pr(>|t|) column) for both predictor variables, minutes and fouls, are less than .05, which means both are statistically significant predictors of points scored at the .05 significance level.
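If we would rather check these p-values in code than read them off the printed summary, one possible approach is to pull them from the coefficient table returned by coef(summary(fit)). The following is a minimal sketch that recreates the data frame and model from above; the names coef_table and p_values are introduced here only for illustration.

```r
#recreate the data frame and model from the example
df <- data.frame(minutes=c(5, 10, 13, 14, 20, 22, 26, 34, 38, 40),
                 fouls=c(5, 5, 3, 4, 2, 1, 3, 2, 1, 1),
                 points=c(6, 8, 8, 7, 14, 10, 22, 24, 28, 30))
fit <- lm(points ~ minutes + fouls, data=df)

#extract the coefficient table (a matrix) from the model summary
coef_table <- coef(summary(fit))

#pull out the p-value column by name
p_values <- coef_table[, "Pr(>|t|)"]
p_values

#check which predictors are significant at the .05 level
p_values[c("minutes", "fouls")] < 0.05
```

Indexing the column by name rather than by position keeps the code readable and avoids depending on the column order of the table.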

Suppose that we would like to extract all of the fitted values from the model.

We can use the fitted.values component of the model object to do so:

#extract fitted values from regression model into new column in original data frame
df$fitted <- fit$fitted.values

#view updated data frame
df

   minutes fouls points    fitted
1        5     5      6  3.911127
2       10     5      8  8.798214
3       13     3      8  7.362896
4       14     4      7 10.524098
5       20     2     14 12.021032
6       22     1     10 11.792082
7       26     3     22 20.069321
8       34     2     24 25.704875
9       38     1     28 27.430760
10      40     1     30 29.385594

The new column named fitted contains the fitted values from the regression model.
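As a side note, base R also offers the fitted() extractor function, and calling predict() on the model without new data should return the same values for the data used in fitting. The sketch below recreates the model and checks that all three approaches agree; the variable names from_component, from_extractor, and from_predict are illustrative only.

```r
#recreate the data frame and model from the example
df <- data.frame(minutes=c(5, 10, 13, 14, 20, 22, 26, 34, 38, 40),
                 fouls=c(5, 5, 3, 4, 2, 1, 3, 2, 1, 1),
                 points=c(6, 8, 8, 7, 14, 10, 22, 24, 28, 30))
fit <- lm(points ~ minutes + fouls, data=df)

#three ways to get the fitted values
from_component <- fit$fitted.values
from_extractor <- fitted(fit)
from_predict <- predict(fit)

#check that the three results agree (up to numerical tolerance)
all.equal(from_component, from_extractor)
all.equal(from_component, from_predict)
```

Which form to use is largely a matter of style; fitted() has the advantage of working with many other model classes in addition to lm objects.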

Note that these are the values that the model predicts for the value in the points column, using the coefficients from the regression model.

For example, we can see:

  • The model predicts that the first player will score 3.91 points. In reality, this player scored 6 points.
  • The model predicts that the second player will score 8.80 points. In reality, this player scored 8 points.
  • The model predicts that the third player will score 7.36 points. In reality, this player scored 8 points.

And so on.
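To see where these predictions come from, we can rebuild them by hand by plugging each player's minutes and fouls into the estimated regression equation. The following is a rough sketch that recreates the model from above; the names b and manual are assumptions made for this illustration.

```r
#recreate the data frame and model from the example
df <- data.frame(minutes=c(5, 10, 13, 14, 20, 22, 26, 34, 38, 40),
                 fouls=c(5, 5, 3, 4, 2, 1, 3, 2, 1, 1),
                 points=c(6, 8, 8, 7, 14, 10, 22, 24, 28, 30))
fit <- lm(points ~ minutes + fouls, data=df)

#extract the estimated coefficients (intercept, minutes, fouls)
b <- coef(fit)

#compute points = b0 + b1(minutes) + b2(fouls) for every player
manual <- unname(b["(Intercept)"] + b["minutes"] * df$minutes + b["fouls"] * df$fouls)

#confirm the manual values match the fitted values from the model
all.equal(manual, unname(fitted(fit)))

#view the predictions for the first three players, rounded to two decimals
round(manual[1:3], 2)
```

For the first player, for instance, this amounts to roughly -11.8949 + 0.9774(5) + 2.1838(5), which is about 3.91.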

By comparing the values from the fitted and points columns, we can get an idea of how well the model was able to predict points using minutes and fouls as predictor variables.
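One way to make this comparison concrete is to compute the residual (actual minus fitted) for each player and then summarize the residuals with a single number. The sketch below is only one possible approach; the column name residual and the variables rmse and r2 are introduced here for illustration.

```r
#recreate the data frame and model from the example
df <- data.frame(minutes=c(5, 10, 13, 14, 20, 22, 26, 34, 38, 40),
                 fouls=c(5, 5, 3, 4, 2, 1, 3, 2, 1, 1),
                 points=c(6, 8, 8, 7, 14, 10, 22, 24, 28, 30))
fit <- lm(points ~ minutes + fouls, data=df)

#add fitted values and residuals (actual - fitted) to the data frame
df$fitted <- unname(fitted(fit))
df$residual <- df$points - df$fitted

#the manual residuals should match the ones stored in the model
all.equal(df$residual, unname(resid(fit)))

#root mean squared error: typical size of a prediction error, in points
rmse <- sqrt(mean(df$residual^2))
rmse

#squared correlation between actual and fitted values
#(for a model with an intercept, this equals the Multiple R-squared)
r2 <- cor(df$points, df$fitted)^2
r2
```

Here r2 should agree with the Multiple R-squared of 0.959 reported in the model summary.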

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Perform Simple Linear Regression in R
How to Perform Multiple Linear Regression in R
How to Perform Polynomial Regression in R
How to Create a Prediction Interval in R
