RMSE is a common metric for regression models in machine learning, but its values can be confusing to interpret. In this post, I explain what RMSE is, show how to interpret its values, and walk through an example.
What is RMSE
Root Mean Squared Error (RMSE) is the square root of the mean squared error between the predicted and actual values.
Squared error, also known as L2 loss, is a row-level error calculation where the difference between the prediction and the actual is squared. RMSE is the aggregated mean and subsequent square root of these errors, which helps us understand the model performance over the whole dataset.
A benefit of using RMSE is that its value is in the same unit as the target being predicted. For example, the RMSE for a house price prediction model is expressed in the same currency as the house prices, which helps end users understand model performance.
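The description above (square each row's error, average the squares, then take the square root) can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation; the function name `rmse` and the sample numbers are my own and carry no meaning beyond the demonstration.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error of two equal-length sequences of numbers."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    # Row-level squared error (L2 loss) for each prediction
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    # Aggregate: mean of the squared errors, then the square root
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative values only
print(rmse([3.0, 5.0, 7.0], [2.0, 5.0, 9.0]))
```

Because the square root undoes the squaring of the units, the result is in the same unit as the inputs.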
RMSE mathematical formula
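The formula for calculating RMSE is:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}$$

where $n$ is the number of rows, $\hat{y}_i$ is the predicted value for row $i$, and $y_i$ is the actual value for row $i$.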
How to interpret RMSE
RMSE is a weighted measure of model accuracy given on the same scale as the prediction target. Simply put, RMSE can be interpreted as the typical size of the error between the model’s predictions and the actual values, with extra weight given to larger errors because each error is squared before averaging.
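To see the extra weight on larger errors, it can help to compare RMSE with the plain mean absolute error (MAE) on two sets of errors that have the same MAE. The helper names and error values below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def mae(errors):
    """Mean absolute error: every error counts in proportion to its size."""
    return sum(abs(e) for e in errors) / len(errors)


def rmse_from_errors(errors):
    """RMSE computed directly from a list of (prediction - actual) errors."""
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))


even_errors = [5, 5, 5, 5]      # four moderate misses
one_big_error = [0, 0, 0, 20]   # three perfect predictions, one large miss

print(mae(even_errors), rmse_from_errors(even_errors))      # 5.0 5.0
print(mae(one_big_error), rmse_from_errors(one_big_error))  # 5.0 10.0
```

Both sets have the same average absolute error, but the single large miss doubles the RMSE.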
RMSE value interpretation
The closer RMSE is to 0, the more accurate the model is. However, because RMSE is on the same scale as the target you are predicting, there is no general rule for what counts as a good or bad value. A given RMSE can only be judged relative to the values in your own dataset.
Let’s try to unpack this more by looking at an example.
An RMSE of 1,000 for a house price prediction model would most likely be seen as good, because house prices tend to be over $100,000. However, the same RMSE of 1,000 for a height prediction model would be terrible, as the average height is around 175 cm.
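One way to make this comparison concrete is to divide the RMSE by the average of the actual values, which is sometimes called a normalized or relative RMSE. This is my own addition to the example rather than part of the RMSE definition. The sketch assumes an average house price of exactly 100,000 (the text only says "over $100,000") and an average height of 175 cm; `relative_rmse` is a hypothetical helper name.

```python
def relative_rmse(rmse_value, mean_actual):
    """RMSE expressed as a fraction of the average actual value."""
    return rmse_value / mean_actual


# Assumed averages, for illustration only
house_price = relative_rmse(1_000, 100_000)
height = relative_rmse(1_000, 175)

print(f"House price model: {house_price:.1%}")  # House price model: 1.0%
print(f"Height model: {height:.1%}")            # Height model: 571.4%
```

The same RMSE is about 1% of a typical house price but several times a typical height, which is why the value only makes sense relative to the target.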
RMSE interpretation example
Let’s use our understanding from the previous sections to walk through an example. I will calculate the RMSE, and then interpret it, for a model that predicts people’s height.
| Predicted height | Actual height | Squared difference |
The RMSE for these predictions is:
RMSE = 9.55
The interpretation of this value is:
The weighted average error between the predictions and actuals in this dataset is 9.55, in the same unit as the heights. That is roughly 5.6% of the average actual height of 170, so it is likely a good value for this dataset.
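The same steps (squared difference per row, then RMSE, then a comparison against the average actual height) can be sketched in Python. The heights below are made-up values in centimetres, used only for illustration; they are not the data behind the 9.55 figure, so the result will differ.

```python
import math

# Hypothetical heights in cm, for illustration only
predicted = [170, 165, 180, 175]
actual = [175, 160, 170, 172]

# Squared difference for each row, as in the third column of the table
squared_diffs = [(p - a) ** 2 for p, a in zip(predicted, actual)]

# RMSE: mean of the squared differences, then the square root
rmse_value = math.sqrt(sum(squared_diffs) / len(squared_diffs))

# Compare the RMSE with the average actual height to judge its size
mean_actual = sum(actual) / len(actual)

print(squared_diffs)  # [25, 25, 100, 9]
print(f"RMSE = {rmse_value:.2f} cm")
print(f"RMSE as a share of average height: {rmse_value / mean_actual:.1%}")
```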