MDAPE (median absolute percentage error) explained!

Median Absolute Percentage Error (MDAPE) is an error metric for regression machine learning models, but it’s not widely understood. In this post, I explain what MDAPE is, how to calculate it, and what a good value is.

What is MDAPE?

Median Absolute Percentage Error (MDAPE) is an error metric used to measure the performance of regression machine learning models. It is the median of the absolute percentage errors between the predictions and their corresponding actual values. The result is expressed as a percentage, which makes it easy for end users to understand.

MDAPE mathematical formula

The formula for calculating MDAPE is as follows:

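$$\mathrm{MDAPE} = \operatorname{median}\left( \left| \frac{A_t - F_t}{A_t} \right| \right) \times 100\%, \quad t = 1, \dots, n$$

where $A_t$ is the actual value and $F_t$ is the prediction for observation $t$.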

How is MDAPE calculated?

Let’s look at an example of how to calculate MDAPE for a regression model which is predicting the price of a house.

First, we will take the dataset and calculate the absolute error and corresponding absolute percentage error:

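| Actual  | Prediction | Absolute Error | Absolute Percentage Error |
|---------|------------|----------------|---------------------------|
| 100,000 | 90,000     | 10,000         | 10%                       |
| 200,000 | 210,000    | 10,000         | 5%                        |
| 150,000 | 155,000    | 5,000          | 3.3%                      |
| 180,000 | 178,000    | 2,000          | 1.1%                      |
| 120,000 | 121,000    | 1,000          | 0.8%                      |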

Then, to calculate MDAPE, we sort these five absolute percentage errors and take the middle value:

0.8%, 1.1%, 3.3%, 5%, 10%
MDAPE = 3.3%
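The same calculation can be sketched in a few lines of NumPy. This is only an illustration using the house prices from the example above; the variable names are my own choice.

```python
import numpy as np

# House prices from the worked example above
actual = np.array([100_000, 200_000, 150_000, 180_000, 120_000])
predicted = np.array([90_000, 210_000, 155_000, 178_000, 121_000])

# Absolute percentage error for each row, in percent
ape = np.abs(actual - predicted) / np.abs(actual) * 100

# np.median finds the middle value itself, so the rows need no manual sorting
mdape = np.median(ape)

print(np.round(ape, 1))         # per-row absolute percentage errors
print(round(float(mdape), 1))   # 3.3
```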

When to use MDAPE

MDAPE is a useful error metric, but there are upsides and downsides to using it in your solution. These are:

Advantages of using MDAPE

  • Error is returned as a percentage, making it easy to understand
  • Because it is a percentage, it can be compared across models, even when they predict values on different scales
  • Not as sensitive to outliers as MAPE, due to it being the median

Disadvantages of using MDAPE

MDAPE vs MAPE

The difference between MDAPE and MAPE is that MDAPE returns the median value of all the errors, whereas MAPE returns the mean. Because of this, MAPE is much more sensitive to outliers than MDAPE. So if removing the influence of outliers is important for your use case, then MDAPE would be best to use.
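To see the difference in practice, here is a small sketch with made-up numbers: four predictions that are close to the actual value and one that is badly wrong. The data is hypothetical and only meant to show how a single outlier moves the mean but not the median.

```python
import numpy as np

# Hypothetical data: four close predictions and one badly wrong one
actual = np.array([100, 100, 100, 100, 100])
predicted = np.array([98, 101, 103, 99, 300])

# Absolute percentage errors in percent: 2, 1, 3, 1 and 200
ape = np.abs(actual - predicted) / np.abs(actual) * 100

mape = np.mean(ape)     # pulled up by the single 200% error
mdape = np.median(ape)  # stays at the typical error

print(f"MAPE:  {mape:.1f}%")   # 41.4%
print(f"MDAPE: {mdape:.1f}%")  # 2.0%
```

Here the one bad prediction pushes MAPE above 40%, while MDAPE still reports the 2% error of a typical row.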

Calculate MDAPE in Python using NumPy

One way to calculate MDAPE in Python is with the NumPy package. An example of how this could be implemented is as follows:

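import numpy as np

# Example values; the actual values must be non-zero to avoid dividing by zero
actual = np.array([100, 90, 110, 150])
predicted = np.array([110, 100, 90, 145])

# Median of the absolute percentage errors, expressed as a percentage
mdape = np.median(np.abs((actual - predicted) / actual)) * 100
print(f"MDAPE: {mdape:.2f}%")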

Is MDAPE available in sklearn?

Unlike many other popular metrics for machine learning models, MDAPE is not available through the scikit-learn package. It therefore needs to be implemented manually, using either NumPy or native Python functions.
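For the native Python route, a minimal sketch using only the standard library could look like the following. The function name `mdape` and its error handling are my own assumptions, not part of any library; in particular, I have chosen to raise an error when an actual value is zero, since the percentage error is undefined in that case.

```python
from statistics import median


def mdape(actual, predicted):
    """Median absolute percentage error, returned in percent.

    Hypothetical helper: the name and the validation rules are assumptions.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")
    # Assumption: zero actual values are rejected rather than skipped
    if any(a == 0 for a in actual):
        raise ValueError("actual values must be non-zero")

    errors = [abs(a - p) / abs(a) * 100 for a, p in zip(actual, predicted)]
    return median(errors)


print(round(mdape([100, 90, 110, 150], [110, 100, 90, 145]), 2))  # 10.56
```

With an even number of observations, `statistics.median` averages the two middle values, which matches what `np.median` does.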

What is a good MDAPE value?

MDAPE returns the error as a percentage ranging from zero to infinity, where the lower the percentage, the more accurate the model, and vice versa. What counts as a good value depends on your use case, but a general rule of thumb that I follow is:

| MDAPE     | Interpretation |
|-----------|----------------|
| <10%      | Very good      |
| 10% - 20% | Good           |
| 20% - 50% | OK             |
| >50%      | Not good       |
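If you want to apply this rule of thumb in code, it could be sketched as a small lookup function. The function name `interpret_mdape` is hypothetical, and because the bands above share their edges, I have assumed that exactly 20% counts as "Good" and exactly 50% counts as "OK".

```python
def interpret_mdape(mdape_percent):
    """Map an MDAPE value (in percent) to the rule-of-thumb label.

    Hypothetical helper; the handling of the band edges is an assumption.
    """
    if mdape_percent < 0:
        raise ValueError("MDAPE cannot be negative")
    if mdape_percent < 10:
        return "Very good"
    if mdape_percent <= 20:   # assumption: 20% itself counts as "Good"
        return "Good"
    if mdape_percent <= 50:   # assumption: 50% itself counts as "OK"
        return "OK"
    return "Not good"


print(interpret_mdape(3.3))   # Very good
print(interpret_mdape(35))    # OK
```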

References

Numpy documentation

Stephen Allwright

I'm a Data Scientist currently working for Oda, an online grocery retailer, in Oslo, Norway. These posts are my way of sharing some of the tips and tricks I've picked up along the way.