Linear regression is a method that we can use in statistics to summarize the relationship between one or more predictor variables and a response variable.

When using a linear regression model, we’re often interested in extracting the **fitted values** of the model, which are the values that the model predicts for the response variable for each observation in the dataset.

To fit a linear regression model in R, we can use the **lm()** function.

Then, to extract the fitted values of the linear regression model, we can use the **fitted.values** component of the fitted model object.

The following example shows how to fit a linear regression model in R and then extract its fitted values.

**Example: How to Extract Fitted Values from Regression Model in R**

Suppose we have the following data frame in R that contains information about various basketball players:

```r
#create data frame
df <- data.frame(minutes=c(5, 10, 13, 14, 20, 22, 26, 34, 38, 40),
                 fouls=c(5, 5, 3, 4, 2, 1, 3, 2, 1, 1),
                 points=c(6, 8, 8, 7, 14, 10, 22, 24, 28, 30))

#view data frame
df

   minutes fouls points
1        5     5      6
2       10     5      8
3       13     3      8
4       14     4      7
5       20     2     14
6       22     1     10
7       26     3     22
8       34     2     24
9       38     1     28
10      40     1     30
```

Suppose we would like to fit the following multiple linear regression model using minutes played and total fouls to predict the number of points scored by each player:

**points = β₀ + β₁(minutes) + β₂(fouls)**

We can use the **lm()** function to fit this model:

```r
#fit multiple linear regression model
fit <- lm(points ~ minutes + fouls, data=df)

#view summary of model
summary(fit)

Call:
lm(formula = points ~ minutes + fouls, data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.5241 -1.4782  0.5918  1.6073  2.0889 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -11.8949     4.5375  -2.621   0.0343 *  
minutes       0.9774     0.1086   9.000 4.26e-05 ***
fouls         2.1838     0.8398   2.600   0.0354 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.148 on 7 degrees of freedom
Multiple R-squared:  0.959,	Adjusted R-squared:  0.9473 
F-statistic: 81.93 on 2 and 7 DF,  p-value: 1.392e-05
```

We can see that the p-values (from the **Pr(>|t|)** column) for both predictor variables, **minutes** and **fouls**, are less than .05, which means that both are statistically significant predictors of **points** at the .05 significance level.
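If we would rather check the p-values programmatically than read them off the printed summary, one possible approach is sketched below. It assumes the same data frame and model as above; the name `p_values` is chosen only for illustration.

```r
#recreate the data frame and model from the example
df <- data.frame(minutes=c(5, 10, 13, 14, 20, 22, 26, 34, 38, 40),
                 fouls=c(5, 5, 3, 4, 2, 1, 3, 2, 1, 1),
                 points=c(6, 8, 8, 7, 14, 10, 22, 24, 28, 30))
fit <- lm(points ~ minutes + fouls, data=df)

#the coefficient table is a matrix, so the p-value column can be selected by name
p_values <- summary(fit)$coefficients[, "Pr(>|t|)"]

#check whether each predictor (ignoring the intercept) is below the .05 threshold
p_values[c("minutes", "fouls")] < 0.05
```

For this dataset, both comparisons return TRUE, which agrees with the summary output.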

Suppose that we would like to extract all of the fitted values from the model.

We can use the **fitted.values** component of the model object to do so:

```r
#extract fitted values from regression model into new column in original data frame
df$fitted <- fit$fitted.values

#view updated data frame
df

   minutes fouls points    fitted
1        5     5      6  3.911127
2       10     5      8  8.798214
3       13     3      8  7.362896
4       14     4      7 10.524098
5       20     2     14 12.021032
6       22     1     10 11.792082
7       26     3     22 20.069321
8       34     2     24 25.704875
9       38     1     28 27.430760
10      40     1     30 29.385594
```

The new column named **fitted** contains the fitted values from the regression model.
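As an aside, base R also provides the **fitted()** accessor function, and **predict()** called without new data returns predictions for the observations used to fit the model. The sketch below, which assumes the same data and model as above, checks that both agree with **fit$fitted.values**. The variable names are illustrative only.

```r
#recreate the data frame and model from the example
df <- data.frame(minutes=c(5, 10, 13, 14, 20, 22, 26, 34, 38, 40),
                 fouls=c(5, 5, 3, 4, 2, 1, 3, 2, 1, 1),
                 points=c(6, 8, 8, 7, 14, 10, 22, 24, 28, 30))
fit <- lm(points ~ minutes + fouls, data=df)

#fitted() is the accessor function for the fitted values
fitted_accessor <- fitted(fit)

#predict() with no newdata argument predicts for the original observations
fitted_predict <- predict(fit)

#both should match the fitted.values component
all.equal(fitted_accessor, fit$fitted.values)
all.equal(fitted_predict, fit$fitted.values)
```

Which form to use is largely a matter of preference; the accessor function has the advantage of working with other model classes that provide a **fitted()** method.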

Note that these are the values that the model predicts for the value in the **points** column, using the coefficients from the regression model.

For example, we can see:

- The model predicts that the first player will score **3.91** points. In reality, this player scored **6** points.
- The model predicts that the second player will score **8.80** points. In reality, this player scored **8** points.
- The model predicts that the third player will score **7.36** points. In reality, this player scored **8** points.

And so on.
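To make the link between the coefficients and the fitted values concrete, the sketch below recomputes the fitted values by hand from the estimated coefficients and compares them with the values stored in the model. It assumes the same data and model as above; `b` and `manual` are names chosen for illustration.

```r
#recreate the data frame and model from the example
df <- data.frame(minutes=c(5, 10, 13, 14, 20, 22, 26, 34, 38, 40),
                 fouls=c(5, 5, 3, 4, 2, 1, 3, 2, 1, 1),
                 points=c(6, 8, 8, 7, 14, 10, 22, 24, 28, 30))
fit <- lm(points ~ minutes + fouls, data=df)

#extract the estimated coefficients
b <- coef(fit)

#apply the regression equation to every row: b0 + b1*minutes + b2*fouls
manual <- unname(b["(Intercept)"] + b["minutes"] * df$minutes + b["fouls"] * df$fouls)

#the hand-computed values should match the stored fitted values
all.equal(manual, unname(fit$fitted.values))

#fitted values for the first three players, rounded to two decimals
round(manual[1:3], 2)
```

The last line returns 3.91, 8.80 and 7.36, which are the predictions for the first three players.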

By comparing the values from the **fitted** and **points** columns, we can get an idea of how well the model was able to predict **points** using **minutes** and **fouls** as predictor variables.
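One way to put a number on this comparison is to compute the residuals (actual minus fitted) and summarize them. The sketch below assumes the same data and model as above. The root mean squared error and the squared correlation are just two possible summaries, and the variable names are illustrative.

```r
#recreate the data frame and model from the example
df <- data.frame(minutes=c(5, 10, 13, 14, 20, 22, 26, 34, 38, 40),
                 fouls=c(5, 5, 3, 4, 2, 1, 3, 2, 1, 1),
                 points=c(6, 8, 8, 7, 14, 10, 22, 24, 28, 30))
fit <- lm(points ~ minutes + fouls, data=df)

fitted_vals <- unname(fit$fitted.values)

#residual = actual points minus fitted points
resid_manual <- df$points - fitted_vals

#these should match the residuals stored in the model
all.equal(resid_manual, unname(resid(fit)))

#root mean squared error: typical size of a prediction error, in points
rmse <- sqrt(mean(resid_manual^2))

#squared correlation between actual and fitted values
r_squared <- cor(df$points, fitted_vals)^2

#for a least squares model with an intercept, this equals the Multiple R-squared
all.equal(r_squared, summary(fit)$r.squared)
```

Note that the RMSE computed here divides by the number of observations, so it is slightly smaller than the residual standard error in the summary output, which divides by the residual degrees of freedom.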

**Additional Resources**

The following tutorials explain how to perform other common tasks in R:

- How to Perform Simple Linear Regression in R
- How to Perform Multiple Linear Regression in R
- How to Perform Polynomial Regression in R
- How to Create a Prediction Interval in R