Median Absolute Percentage Error (MDAPE) is an error metric for regression machine learning models, but it’s not widely understood. In this post, I explain what MDAPE is, how to calculate it, and what a good value is.
What is MDAPE?
Median Absolute Percentage Error (MDAPE) is an error metric used to measure the performance of regression machine learning models. It is the median of all absolute percentage errors calculated between the predictions and their corresponding actual values. The result is expressed as a percentage, which makes it easy for end users to understand.
MDAPE mathematical formula
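The formula for calculating MDAPE is as follows:

$$\text{MDAPE} = \operatorname{median}\left( \left| \frac{A_t - F_t}{A_t} \right| \right) \times 100\%, \quad t = 1, \dots, n$$

where $A_t$ is the actual value, $F_t$ is the predicted value, and $n$ is the number of observations.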
How is MDAPE calculated?
Let's look at an example of how to calculate MDAPE for a regression model that predicts the price of a house.
First, we take the dataset and calculate the absolute error and the corresponding absolute percentage error for each prediction:
| Actual | Prediction | Absolute Error | Absolute Percentage Error |
Then, to calculate MDAPE, we take the median of these five absolute percentage errors, listed here from largest to smallest:
10%, 5%, 3.3%, 1.1%, 0.8%
MDAPE = 3.3%
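The median step can be checked in a few lines of Python. This is a minimal sketch using the standard library's `statistics.median`; the variable name `ape_percent` is only illustrative, and the values are the five absolute percentage errors listed above.

```python
from statistics import median

# The five absolute percentage errors from the example above, in percent.
ape_percent = [10, 5, 3.3, 1.1, 0.8]

# median() sorts the values and returns the middle one (the 3rd of 5 here).
mdape = median(ape_percent)
print(f"MDAPE = {mdape}%")  # MDAPE = 3.3%
```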
When to use MDAPE
MDAPE is a useful error metric, but there are upsides and downsides to using it in your solution. These are:
Advantages of using MDAPE
- Error is returned as a percentage, making it easy to understand
- Possible to compare against other models, even ones trained on targets with a different scale, because the error is a percentage
- Not as sensitive to outliers as MAPE, because it uses the median rather than the mean
Disadvantages of using MDAPE
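- If your actual values can be zero, the percentage error is undefined because of division by zero, and actual values close to zero produce very large percentage errors, so MDAPE can be impossible to calculate or misleading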
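To make the zero-value problem concrete: in plain Python, dividing by an actual value of zero raises `ZeroDivisionError`. The sketch below shows one possible workaround, which is to leave out rows whose actual value is zero. This is an assumption made for illustration rather than part of the MDAPE definition, the helper name `mdape_skipping_zeros` is hypothetical, and dropping rows changes what the metric measures, so it may not suit every use case.

```python
from statistics import median

def mdape_skipping_zeros(actual, predicted):
    """MDAPE in percent, computed only over rows whose actual value is non-zero.

    Hypothetical helper: dropping zero actuals is an assumption made for this
    sketch, not a standard definition of MDAPE.
    """
    errors = [
        abs(a - p) * 100 / abs(a)
        for a, p in zip(actual, predicted)
        if a != 0  # skip rows where the percentage error is undefined
    ]
    if not errors:
        raise ValueError("MDAPE is undefined: every actual value is zero")
    return median(errors)

# Made-up data: the first actual value is zero and is ignored.
actual = [0, 50, 100, 200]
predicted = [5, 45, 110, 190]
result = mdape_skipping_zeros(actual, predicted)
print(f"MDAPE (zeros skipped) = {result}%")  # MDAPE (zeros skipped) = 10.0%
```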
MDAPE vs MAPE
The difference between MDAPE and MAPE is that MDAPE returns the median of the absolute percentage errors, whereas MAPE returns their mean. Because of this, MAPE is much more sensitive to outliers than MDAPE. So if reducing the influence of outliers is important for your use case, MDAPE is the better choice.
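A small experiment illustrates the difference. The data below is made up for this sketch: four predictions are close to the actual values and one is badly wrong. The helper name `absolute_percentage_errors` is hypothetical.

```python
from statistics import mean, median

def absolute_percentage_errors(actual, predicted):
    """Return the absolute percentage error of each prediction, in percent."""
    return [abs(a - p) * 100 / abs(a) for a, p in zip(actual, predicted)]

# Made-up data: the last prediction is an outlier (1500 instead of about 500).
actual = [100, 200, 300, 400, 500]
predicted = [110, 190, 309, 408, 1500]

errors = absolute_percentage_errors(actual, predicted)  # [10.0, 5.0, 3.0, 2.0, 200.0]
mape = mean(errors)     # pulled up by the single 200% error
mdape = median(errors)  # the middle value, unaffected by how large the outlier is

print(f"MAPE  = {mape}%")   # MAPE  = 44.0%
print(f"MDAPE = {mdape}%")  # MDAPE = 5.0%
```

A single bad prediction moves MAPE to 44%, while MDAPE stays at 5%, which is the typical error for the other rows.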
Calculate MDAPE in Python using NumPy
To calculate MDAPE in Python, we can use the NumPy package. An example of how this could be implemented is as follows:
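    import numpy as np

    # Example actual values and the model's predictions
    actual = np.array([100, 90, 110, 150])
    predicted = np.array([110, 100, 90, 145])

    # Median of the absolute percentage errors, expressed as a percentage
    mdape = np.median(np.abs((actual - predicted) / actual)) * 100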
Is MDAPE available in sklearn?
Unlike other popular metrics for machine learning models, MDAPE is not available through the scikit-learn package. Therefore, it needs to be implemented manually using either NumPy or native Python functions.
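For the native Python route, a minimal sketch using only the standard library could look like the following. The function name `mdape` and the input checks are assumptions added for illustration, and the data is the same as in the NumPy example above.

```python
from statistics import median

def mdape(actual, predicted):
    """Median absolute percentage error, in percent, using only the standard library."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")
    # Absolute percentage error of each prediction.
    # An actual value of zero raises ZeroDivisionError here.
    errors = [abs(a - p) / abs(a) * 100 for a, p in zip(actual, predicted)]
    return median(errors)

actual = [100, 90, 110, 150]
predicted = [110, 100, 90, 145]
print(f"MDAPE = {mdape(actual, predicted):.2f}%")  # MDAPE = 10.56%
```

With an even number of observations, the median is the average of the two middle errors, which is why the result falls between 10% and 11.1% here.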
What is a good MDAPE value?
MDAPE returns the error as a percentage ranging from zero to infinity, where a lower percentage means a more accurate model. What counts as a good value depends on your use case, but a general rule of thumb that I follow is:
| 10% - 20% | Good |
| 20% - 50% | OK |