The root mean square error (RMSE) is a metric that tells us how far apart our predicted values are from our observed values in a model, on average. It is calculated as:
$$\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(P_i - O_i)^2}$$
where:
- $\Sigma$ is a fancy symbol that means “sum”
- $P_i$ is the predicted value for the ith observation
- $O_i$ is the observed value for the ith observation
- $n$ is the sample size
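The formula above translates almost directly into code. The following is a minimal sketch in plain Python. The function name rmse and the sample values are made up for illustration and do not come from any library:

```python
from math import sqrt

def rmse(predicted, observed):
    """Root mean square error of two equal-length sequences (hypothetical helper)."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    n = len(observed)
    if n == 0:
        raise ValueError("at least one observation is required")
    #sum of the squared differences (P_i - O_i)^2
    squared_errors = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    #divide by the sample size, then take the square root
    return sqrt(squared_errors / n)

#illustrative values: every prediction is off by exactly 2
example_rmse = rmse([3, 5, 2, 8], [1, 7, 4, 6])
print(example_rmse)  # 2.0
```

Since every prediction in this toy example misses by exactly 2, the RMSE is also 2, which is a quick sanity check on the function.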
This tutorial explains a simple method to calculate RMSE in Python.
Example: Calculate RMSE in Python
Suppose we have the following lists of actual and predicted values:
actual = [34, 37, 44, 47, 48, 48, 46, 43, 32, 27, 26, 24]
pred = [37, 40, 46, 44, 46, 50, 45, 44, 34, 30, 22, 23]
To calculate the RMSE between the actual and predicted values, we can take the square root of the value returned by the mean_squared_error() function from the sklearn.metrics module:
#import necessary libraries
from sklearn.metrics import mean_squared_error
from math import sqrt

#calculate RMSE
sqrt(mean_squared_error(actual, pred))

2.4324199198
The RMSE turns out to be 2.4324.
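To see where this number comes from, we can walk through the arithmetic one step at a time. The sketch below uses only the standard library and the same two lists. The variable names squared_errors, sse, mse, and rmse_manual are chosen here for illustration:

```python
from math import sqrt

actual = [34, 37, 44, 47, 48, 48, 46, 43, 32, 27, 26, 24]
pred = [37, 40, 46, 44, 46, 50, 45, 44, 34, 30, 22, 23]

#squared difference for each observation
squared_errors = [(p - a) ** 2 for p, a in zip(pred, actual)]

#sum of the squared differences
sse = sum(squared_errors)      # 71

#mean squared error: divide by the sample size
mse = sse / len(actual)        # 71 / 12

#root mean square error
rmse_manual = sqrt(mse)
print(round(rmse_manual, 4))   # 2.4324
```

The manual result matches the value obtained from mean_squared_error().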
How to Interpret RMSE
RMSE is a useful way to see how well a model fits a dataset. It is expressed in the same units as the observed values, so the RMSE of 2.4324 in the example above means the predictions are typically off by roughly 2.4 units. The larger the RMSE, the larger the difference between the predicted and observed values, and the worse the model fits the data. Conversely, the smaller the RMSE, the better the model fits the data.
It can be particularly useful to compare the RMSE of two different models fit to the same dataset to see which model fits the data better.
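As a rough illustration of such a comparison, the sketch below scores two sets of predictions against the same actual values. Here pred_a is the pred list from the example, while pred_b is a hypothetical second model whose values are invented purely for demonstration. The rmse helper is also our own and not a library function:

```python
from math import sqrt

def rmse(predicted, observed):
    """Root mean square error (hypothetical helper, not a library function)."""
    n = len(observed)
    return sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)

actual = [34, 37, 44, 47, 48, 48, 46, 43, 32, 27, 26, 24]

#predictions from the example
pred_a = [37, 40, 46, 44, 46, 50, 45, 44, 34, 30, 22, 23]

#made-up predictions from an imaginary second model
pred_b = [35, 38, 45, 46, 47, 49, 46, 42, 33, 28, 25, 24]

#calculate the RMSE of each model on the same actual values
scores = {"model A": rmse(pred_a, actual), "model B": rmse(pred_b, actual)}

#the model with the lower RMSE fits this dataset more closely
best_model = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: RMSE = {score:.4f}")
print("Lower RMSE:", best_model)
```

The comparison is only meaningful because both models are scored against the same actual values. RMSE values computed on different datasets or in different units are not directly comparable.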