What is MDAPE and how do I calculate it in Python?


Median Absolute Percentage Error (MDAPE) is an error metric for regression machine learning models, but it’s not widely understood. In this post, I explain what MDAPE is, how to calculate it, and what a good value is.

Stephen Allwright

12 Aug 2022

What is MDAPE?

Median Absolute Percentage Error (MDAPE) is an error metric used to measure the performance of regression machine learning models. It is the median of the absolute percentage errors between the predictions and their corresponding actual values. The result is expressed as a percentage, which makes it easy for end users to understand.

MDAPE mathematical formula

The formula for calculating MDAPE is as follows:

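$$\mathrm{MDAPE} = \operatorname{median}_{i=1,\dots,n}\left( \left| \frac{A_i - P_i}{A_i} \right| \right) \times 100$$

where A_i is the actual value, P_i is the corresponding prediction, and n is the number of observations.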

How is MDAPE calculated?

Let’s look at an example of how to calculate MDAPE for a regression model which is predicting the price of a house.

First, we will take the dataset and calculate the absolute error and corresponding absolute percentage error:

Actual	Prediction	Absolute Error	Absolute Percentage Error
100,000	90,000	10,000	10%
200,000	210,000	10,000	5%
150,000	155,000	5,000	3.3%
180,000	178,000	2,000	1.1%
120,000	121,000	1,000	0.8%

Then to calculate MDAPE, we take the median of these five absolute percentage errors, which is:

Sorted: 0.8%, 1.1%, 3.3%, 5%, 10%
MDAPE = 3.3% (the middle value)
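The same calculation can be sketched in Python to check the numbers in the table. This is a minimal illustration that assumes NumPy is installed; the variable names are my own and not part of any library.

```python
import numpy as np

# House prices from the table above
actual = np.array([100_000, 200_000, 150_000, 180_000, 120_000], dtype=float)
predicted = np.array([90_000, 210_000, 155_000, 178_000, 121_000], dtype=float)

# Step 1: absolute error for each row
abs_error = np.abs(actual - predicted)

# Step 2: absolute percentage error for each row
ape = abs_error / np.abs(actual) * 100

# Step 3: MDAPE is the median of those percentages
mdape = np.median(ape)

print(np.round(ape, 1))
print(round(mdape, 1))
```

The median is the third of the five sorted percentage errors, which matches the 3.3% worked out by hand.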

When to use MDAPE

MDAPE is a useful error metric, but there are upsides and downsides to using it in your solution. These are:

Advantages of using MDAPE

Error is returned as a percentage, making it easy to understand
Can be compared across models and datasets, because a percentage does not depend on the scale of the target
Not as sensitive to outliers as MAPE, due to it being the median

Disadvantages of using MDAPE

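If any actual value is zero, its percentage error is undefined because it requires dividing by zero. Actual values close to zero produce extremely large percentage errors, which can make MDAPE misleading if there are many of them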
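One possible workaround, which is my own assumption and not a standard definition, is to leave out the rows where the actual value is exactly zero before taking the median. The function name `mdape_ignoring_zeros` below is hypothetical, and the sketch assumes NumPy is installed. Keep in mind that dropping rows changes what the metric measures, so this should be reported alongside the result.

```python
import numpy as np

def mdape_ignoring_zeros(actual, predicted):
    """MDAPE over the rows whose actual value is non-zero (hypothetical helper)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)

    # Keep only the rows where the percentage error is defined
    mask = actual != 0
    if not mask.any():
        # Every actual value is zero, so there is nothing to take a median of
        return float("nan")

    ape = np.abs((actual[mask] - predicted[mask]) / actual[mask])
    return float(np.median(ape) * 100)

# The first row has an actual value of zero and is skipped
result = mdape_ignoring_zeros([0, 100, 200], [5, 110, 190])
print(result)  # roughly 7.5
```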

MDAPE vs MAPE

The difference between MDAPE and MAPE is that MDAPE returns the median value of all the errors, whereas MAPE returns the mean. Because of this, MAPE is much more sensitive to outliers than MDAPE. So if removing the influence of outliers is important for your use case, then MDAPE would be best to use.
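To make the difference concrete, here is a small sketch with made-up numbers (not from a real model) where one prediction is badly wrong. It assumes NumPy is installed.

```python
import numpy as np

# Illustrative data: four reasonable predictions and one large miss
actual = np.array([100, 100, 100, 100, 100], dtype=float)
predicted = np.array([90, 105, 98, 102, 400], dtype=float)

# Absolute percentage error for each row
ape = np.abs((actual - predicted) / actual) * 100

mape = np.mean(ape)     # mean is pulled up by the single 300% error
mdape = np.median(ape)  # median only looks at the middle value

print(round(mape, 1), round(mdape, 1))
```

The single outlier dominates MAPE, while MDAPE stays close to the typical error of the other four predictions.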

Calculate MDAPE in Python using NumPy

One convenient way to calculate MDAPE in Python is with the NumPy package. An example of how this could be implemented is as follows:

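import numpy as np

# Use arrays so that the arithmetic below is element-wise
actual = np.array([100, 90, 110, 150])
predicted = np.array([110, 100, 90, 145])

# Median of the absolute percentage errors, expressed as a percentage
mdape = np.median(np.abs((actual - predicted) / actual)) * 100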

Is MDAPE available in sklearn?

Unlike other popular metrics for machine learning models, MDAPE is not available through the scikit-learn package. Therefore, it needs to be implemented manually using either NumPy or native Python functions.
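For the native Python route, a minimal sketch using only the standard library's `statistics.median` could look like this. The function name `mdape` is my own choice and not an existing API, and the example assumes no actual value is zero.

```python
from statistics import median

def mdape(actual, predicted):
    """Median absolute percentage error, returned as a percentage."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    # Absolute percentage error per pair; raises ZeroDivisionError if an actual is 0
    ape = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return median(ape) * 100

score = mdape([100, 90, 110, 150], [110, 100, 90, 145])
print(round(score, 2))
```

With an even number of observations, `statistics.median` averages the two middle values, which is the same behaviour as `np.median`.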

What is a good MDAPE value?

MDAPE returns the error as a percentage ranging from zero to infinity, where a lower percentage means a more accurate model. What counts as a good value depends on your use case, but a general rule of thumb that I follow is:

MDAPE	Interpretation
<10%	Very good
10% - 20%	Good
20% - 50%	OK
>50%	Not good


I'm a Data Scientist currently working for Oda, an online grocery retailer, in Oslo, Norway. These posts are my way of sharing some of the tips and tricks I've picked up along the way.