MAPE (Mean Absolute Percentage Error) is a common regression metric in machine learning, but it is undefined when an actual value is 0 and becomes extremely large when actual values are close to 0. In this post, I explain why this happens and what to do when it does.
What is Mean Absolute Percentage Error?
MAPE (Mean Absolute Percentage Error) is the mean of all absolute percentage errors between the predicted and actual values.
Absolute percentage error is a row-level error calculation where the absolute difference between the prediction and the actual is divided by the actual value to return the error as a relative percentage. MAPE is the mean of these errors, which helps us understand the model performance over the whole dataset.
MAPE is a popular metric because the error value is easy to interpret and, being relative, can be compared across datasets.
MAPE metric definition
The mathematical formula for calculating MAPE is:
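Written out from the definition above, the formula can be expressed as follows. The symbol names are my own choice here: n is the number of rows, A_i is the actual value and P_i is the predicted value for row i. Many sources also multiply the result by 100 to report it as a percentage, while the example later in this post keeps it as a fraction.

```latex
\[
\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{A_i - P_i}{A_i} \right|
\]
```

The term A_i in the denominator is the source of the problem discussed below: the expression has no value when A_i = 0.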
How to define MAPE when actual is 0
When an actual value is exactly 0, MAPE is not defined. This is because calculating MAPE requires dividing by the actual value, so a 0 actual gives either infinity or a division by zero error, depending on the tool you use. When an actual value is close to 0 but not equal to it, the division still works, but the percentage error for that row becomes so large that it dominates the mean and the metric stops being useful.
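To make this concrete, here is a minimal sketch in plain Python. The function name `row_ape` and the fixed prediction of 1.5 are hypothetical choices for illustration only. It shows the row-level error growing as the actual value shrinks, and plain Python raising an error once the actual is exactly 0.

```python
def row_ape(actual, predicted):
    """Absolute percentage error for a single row, as a fraction."""
    return abs(actual - predicted) / abs(actual)


# Hypothetical prediction held fixed while the actual moves towards 0
predicted = 1.5

for actual in (1.0, 0.1, 0.001, 0.0):
    try:
        print(actual, row_ape(actual, predicted))
    except ZeroDivisionError as exc:
        # Plain Python floats raise instead of returning infinity
        print(actual, "ZeroDivisionError:", exc)
```

Array libraries that follow floating-point conventions typically return inf for the last case rather than raising, which is why you may see either behaviour.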
Why does MAPE receive an inf error?
The inf result is returned because at least one actual value in the dataset is 0. The absolute percentage error for that row involves a division by 0, which evaluates to infinity, and a mean that includes an infinite value is itself infinite.
The cause of this infinite error becomes clear if we look at an example:
| Actual | Predicted | Absolute Error | Absolute Percentage Error |
| 0 | 1 | 1 | inf (div by 0) |
The fourth row in the dataset has an actual value of 0. When we calculate the absolute percentage error for this row, we get infinity as there is a division by 0:
|0 - 1| / 0 = inf
Therefore when we come to calculate the mean of all the Absolute Percentage Errors for our MAPE metric, we get the following:
MAPE = (0.2 + 0.5 + 0.125 + inf) / 4 = inf
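The same calculation can be sketched in plain Python. The first three rows of data below are hypothetical values that I picked so they reproduce the errors 0.2, 0.5 and 0.125 used above. Only the last row (actual 0, predicted 1) comes from the table. The helper names are also my own, and returning `math.inf` for a zero actual is an assumption that mirrors floating-point behaviour rather than what plain Python division does.

```python
import math


def absolute_percentage_errors(actuals, predictions):
    """Row-level |actual - predicted| / |actual| as fractions."""
    errors = []
    for actual, predicted in zip(actuals, predictions):
        if actual == 0:
            # Assumption: follow the floating-point convention, where
            # x / 0 is inf for x > 0 and 0 / 0 is nan.
            errors.append(math.inf if predicted != 0 else math.nan)
        else:
            errors.append(abs(actual - predicted) / abs(actual))
    return errors


def mape(actuals, predictions):
    """Mean of the row-level absolute percentage errors."""
    errors = absolute_percentage_errors(actuals, predictions)
    return sum(errors) / len(errors)


# Hypothetical rows; only the final (0, 1) pair is taken from the post
actuals = [10, 4, 8, 0]
predictions = [8, 6, 9, 1]

print(absolute_percentage_errors(actuals, predictions))
print(mape(actuals, predictions))
```

A single infinite row is enough to make the whole metric infinite, no matter how accurate the other predictions are.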
How to calculate MAPE with zero values
When your dataset has actual values at or around 0, MAPE is not a usable metric. Here are your options instead:
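- Use SMAPE (Symmetric Mean Absolute Percentage Error). This metric is similar to MAPE and still returns the error as a percentage. It’s a common alternative to MAPE when actuals are close to 0, although it is itself undefined for a row where both the actual and the prediction are 0.
- Use RMSE (Root Mean Squared Error) or MAE (Mean Absolute Error). These regression metrics are widely used and are well defined when actual values are 0.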
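As a rough illustration of these alternatives, here is a minimal sketch in plain Python. SMAPE has several variants in use, and the one below (absolute error divided by the average of the absolute actual and absolute prediction) is just one common form, not the only definition. Treating a row where both values are 0 as zero error is my own assumption, and the data is hypothetical.

```python
import math


def smape(actuals, predictions):
    """One common SMAPE variant, returned as a fraction (0 to 2)."""
    total = 0.0
    for a, p in zip(actuals, predictions):
        denominator = (abs(a) + abs(p)) / 2
        # Assumption: a row where actual and prediction are both 0
        # counts as zero error instead of 0 / 0.
        total += 0.0 if denominator == 0 else abs(a - p) / denominator
    return total / len(actuals)


def mae(actuals, predictions):
    """Mean absolute error, in the units of the target."""
    return sum(abs(a - p) for a, p in zip(actuals, predictions)) / len(actuals)


def rmse(actuals, predictions):
    """Root mean squared error, in the units of the target."""
    squared = sum((a - p) ** 2 for a, p in zip(actuals, predictions))
    return math.sqrt(squared / len(actuals))


# Hypothetical data that includes an actual value of 0
actuals = [10, 4, 8, 0]
predictions = [8, 6, 9, 1]

print("SMAPE:", smape(actuals, predictions))
print("MAE:  ", mae(actuals, predictions))
print("RMSE: ", rmse(actuals, predictions))
```

All three stay finite on this data. Note that MAE and RMSE are expressed in the units of the target rather than as a percentage, so they are harder to compare across datasets with different scales.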
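One approach that is not recommended is changing your actual values to make MAPE defined again, for example by adding 1 to each value. Shifting the target changes the relative size of every error, so the resulting percentage no longer describes the original data, and the size of the shift is arbitrary. This will have a negative effect on your modelling and validation. It is much better to change the metric you are using than to manually adjust the target.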
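A small sketch of why shifting the target distorts the metric, using hypothetical numbers: the absolute error is held at 1 while a constant is added to both the actual and the prediction, on the assumption that a model trained on the shifted target predicts in the shifted space. The function name `shifted_ape` is my own.

```python
def shifted_ape(actual, predicted, shift):
    """Absolute percentage error after adding `shift` to both values."""
    return abs((actual + shift) - (predicted + shift)) / abs(actual + shift)


# Hypothetical row: the prediction is off by exactly 1 in every case
actual, predicted = 1, 2

for shift in (0, 1, 100):
    print(shift, shifted_ape(actual, predicted, shift))
```

The reported percentage error drops as the shift grows even though the prediction is no better, so the number depends on the arbitrary constant rather than on the model.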