How to interpret R Squared (simply explained)

R Squared is a common metric for regression machine learning models, but its values can be confusing to interpret. In this post, I explain what R Squared is and how to interpret its values, then walk through an example.

What is R Squared

R Squared (also known as R² or the coefficient of determination) is a metric for assessing the performance of regression machine learning models. Unlike error metrics such as MAE or RMSE, it does not tell you how large the prediction errors are, but is instead a measure of fit. R Squared measures how much of the variation in the dependent variable is explained by the independent variables in the model.

R Squared mathematical formula

The formula for calculating R Squared is as follows:

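$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}} = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$$

Here $y_i$ are the actual values, $\hat{y}_i$ are the model's predictions, and $\bar{y}$ is the mean of the actual values.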
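As a minimal sketch, the standard definition of R Squared (one minus the residual sum of squares divided by the total sum of squares) can be written in plain Python. The function name `r_squared` and its arguments are my own choices for illustration, not part of any library.

```python
def r_squared(y_true, y_pred):
    """Return R Squared = 1 - SS_res / SS_tot for two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    mean_y = sum(y_true) / len(y_true)

    # Residual sum of squares: variation the model fails to explain
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Total sum of squares: variation of the target around its mean
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)

    if ss_tot == 0:
        raise ValueError("R Squared is undefined when the target is constant")

    return 1 - ss_res / ss_tot
```

Perfect predictions give a value of 1, while always predicting the mean of the target gives a value of 0.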

How to interpret R Squared

R Squared can be interpreted as the proportion of the variance in the dependent variable that is explained by the independent variables. Put simply, it measures the extent to which the model features can be used to explain the model target.

For example, an R Squared value of 0.9 would imply that 90% of the target variance can be explained by the model features, whilst a value of 0.2 would suggest that the model features are only able to account for 20% of the variance.

R Squared value interpretation

Now that we understand how to interpret the meaning of R Squared, let’s look at how to interpret the different values that it can produce. This will be dependent upon your use case and dataset, but a general rule that I follow is:

R Squared value    Interpretation
0.75 - 1           Significant amount of variance explained
0.5 - 0.75         Good amount of variance explained
0.25 - 0.5         Small amount of variance explained
0 - 0.25           Little to no variance explained
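The rule of thumb above can be sketched as a small lookup function. The name `interpret_r_squared` is hypothetical, and assigning each boundary value (0.25, 0.5, 0.75) to the higher band is my assumption, since the table does not say which band a boundary belongs to. R Squared can also be negative when a model fits worse than always predicting the mean, and this sketch simply places such values in the lowest band.

```python
def interpret_r_squared(r2):
    """Map an R Squared value to the rule-of-thumb bands from the table."""
    if r2 > 1:
        raise ValueError("R Squared cannot be greater than 1")

    # Assumption: a value sitting exactly on a boundary goes to the higher band
    if r2 >= 0.75:
        return "Significant amount of variance explained"
    if r2 >= 0.5:
        return "Good amount of variance explained"
    if r2 >= 0.25:
        return "Small amount of variance explained"
    # Assumption: negative values are grouped with the lowest band
    return "Little to no variance explained"
```

These thresholds remain a general guide rather than a fixed standard, so it may be worth adjusting them for your own use case and dataset.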

R Squared interpretation example

Let’s use our understanding from the previous sections to walk through an example. I will be calculating the R Squared value and subsequent interpretation for an example where we want to understand how much of the Height variance can be explained by Shoe Size.

Shoe size    Height
10           180
6            160
9            170
4            150
11           200
13           190
4            140

The R Squared value for this data is:

R Squared = 0.88

The interpretation of this value is:

88% of the variance in Height is explained by Shoe Size, which is commonly seen as a significant amount of the variance being explained.
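The post does not say which model produced this value, so the following is a sketch under one assumption: that Height is fitted to Shoe Size with a simple least-squares straight line. Under that assumption, the figure can be reproduced in plain Python from the data in the table. All variable names are my own.

```python
# Data from the example table
shoe_size = [10, 6, 9, 4, 11, 13, 4]
height = [180, 160, 170, 150, 200, 190, 140]

n = len(shoe_size)
mean_x = sum(shoe_size) / n
mean_y = sum(height) / n

# Assumption: simple linear regression fitted by ordinary least squares
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(shoe_size, height))
s_xx = sum((x - mean_x) ** 2 for x in shoe_size)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

predicted = [intercept + slope * x for x in shoe_size]

# R Squared = 1 - SS_res / SS_tot
ss_res = sum((y - p) ** 2 for y, p in zip(height, predicted))
ss_tot = sum((y - mean_y) ** 2 for y in height)
r2 = 1 - ss_res / ss_tot

print(round(r2, 2))  # 0.88
```

A different model fitted to the same data would generally produce a different R Squared value.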


Stephen Allwright

I'm a Data Scientist currently working for Oda, an online grocery retailer, in Oslo, Norway. These posts are my way of sharing some of the tips and tricks I've picked up along the way.