One of the most common metrics used to measure the prediction accuracy of a model is MSE, which stands for mean squared error. It is calculated as:
MSE = (1/n) * Σ(actual – prediction)²
where:
- Σ – a fancy symbol that means “sum”
- n – sample size
- actual – the actual data value
- prediction – the predicted data value
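The lower the MSE, the closer a model's predictions are to the actual values. Because each error is squared, MSE is expressed in squared units of the response variable and penalizes large errors more heavily than small ones.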
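To make the formula concrete, here is a minimal sketch that works through it step by step in R. The four actual and predicted values are hypothetical toy numbers chosen only to keep the arithmetic easy to follow; they do not come from any dataset used in this tutorial.

```r
# Hypothetical toy values, chosen only to make the arithmetic easy to follow
actual     <- c(3, 5, 2, 7)
prediction <- c(2.5, 5, 4, 8)

# Step 1: errors (actual - prediction): 0.5, 0, -2, -1
errors <- actual - prediction

# Step 2: squared errors: 0.25, 0, 4, 1
squared_errors <- errors^2

# Step 3: sum the squared errors (5.25) and divide by the sample size n (4)
n <- length(actual)
mse <- sum(squared_errors) / n
mse
# [1] 1.3125
```

Note how the single error of -2 contributes 4 of the 5.25 total, which is the effect of squaring: large misses dominate the result.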
How to Calculate MSE in R
Depending on what format your data is in, there are two easy methods you can use to calculate the MSE of a regression model in R.
Method 1: Calculate MSE from Regression Model
In one scenario, you may have a fitted regression model and would simply like to calculate the MSE of the model. For example, you may have the following regression model:
#load mtcars dataset
data(mtcars)
#fit regression model
model <- lm(mpg~disp+hp, data=mtcars)
#get model summary
model_summ <- summary(model)
To calculate the MSE for this model, you can take the mean of the squared residuals stored in the model summary:
#calculate MSE
mean(model_summ$residuals^2)
[1] 8.85917
This tells us that the MSE is 8.85917.
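One point that can cause confusion: the value above divides the sum of squared residuals by n, as in the formula at the top of this tutorial. Regression output often reports a residual variance that divides by the residual degrees of freedom (n minus the number of estimated coefficients) instead, so it comes out slightly larger. The sketch below, which refits the same model so it can run on its own, shows how the two quantities relate. The variable names mse_n and mse_df are illustrative, not part of any R API.

```r
# Refit the model from the example above so this block runs on its own
model <- lm(mpg ~ disp + hp, data = mtcars)

# MSE as defined in this tutorial: sum of squared residuals divided by n
mse_n <- mean(residuals(model)^2)

# Residual variance from summary(): divides by n - p instead, where p is the
# number of estimated coefficients (3 here, including the intercept)
mse_df <- summary(model)$sigma^2

# The two differ only by the factor n / (n - p)
n <- nrow(mtcars)
all.equal(mse_df, mse_n * n / df.residual(model))
```

If you compare the MSE from this tutorial with a figure from another source, it is worth checking which divisor that source used.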
Method 2: Calculate MSE from a List of Predicted and Actual Values
In another scenario, you may simply have a list of predicted and actual values. For example:
#create data frame with a column of actual values and a column of predicted values
data <- data.frame(pred = predict(model), actual = mtcars$mpg)
#view first six lines of data
head(data)
                      pred actual
Mazda RX4         23.14809   21.0
Mazda RX4 Wag     23.14809   21.0
Datsun 710        25.14838   22.8
Hornet 4 Drive    20.17416   21.4
Hornet Sportabout 15.46423   18.7
Valiant           21.29978   18.1
In this case, you can calculate the MSE directly from the two columns by taking the mean of the squared differences:
#calculate MSE
mean((data$actual - data$pred)^2)
[1] 8.85917
This tells us that the MSE is 8.85917, which matches the MSE that we calculated using the previous method.
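If you calculate MSE often, it may be convenient to wrap the calculation in a small function. The following is a minimal sketch; the name mse and its na.rm argument are hypothetical choices for this example, not a function that ships with base R. It adds a length check, because R would otherwise recycle a shorter vector and return a misleading result.

```r
# Hypothetical helper (not a base R function): MSE of two numeric vectors
mse <- function(actual, predicted, na.rm = FALSE) {
  # Guard against silently recycling vectors of different lengths
  stopifnot(is.numeric(actual), is.numeric(predicted),
            length(actual) == length(predicted))
  mean((actual - predicted)^2, na.rm = na.rm)
}

# Usage with the same model and data as above
model <- lm(mpg ~ disp + hp, data = mtcars)
mse(mtcars$mpg, predict(model))
```

Called on the mtcars model, this should return the same 8.85917 obtained with both methods above.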