L1 loss function, explained

L1 loss is a loss function commonly used in machine learning. In this post I will explain what it is, how to implement it in Python, and some common questions that users have.

Stephen Allwright

11 Jun 2022

L1 loss function, what is it?

L1 loss, also known as Absolute Error Loss, is the absolute difference between a prediction and the actual value, calculated for each example in a dataset. The aggregation of all these loss values is called the cost function, where the cost function for L1 is commonly MAE (Mean Absolute Error).

L1 loss function formula

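The mathematical formula for calculating L1 loss is:

$$L_1(y, \hat{y}) = |y - \hat{y}|$$

where $y$ is the actual value and $\hat{y}$ is the prediction for a single example.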

L1 loss function example

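Let’s say we are predicting house prices with a regression model. We could calculate the L1 loss per training example as the absolute difference between the actual price and the predicted price. For instance, if a house actually sold for 300,000 and the model predicted 280,000, the L1 loss for that example would be 20,000.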
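To make the per-example calculation concrete, here is a minimal sketch in plain Python. The prices are made up purely for illustration (they are assumptions, not real data), and the variable names are my own:

```python
# Hypothetical house prices, for illustration only (not real data).
actual_prices = [300_000, 450_000, 250_000]
predicted_prices = [280_000, 465_000, 250_000]

# L1 loss per example: absolute difference between actual and predicted price.
l1_losses = [abs(a - p) for a, p in zip(actual_prices, predicted_prices)]

# Print one row per training example.
for actual, predicted, loss in zip(actual_prices, predicted_prices, l1_losses):
    print(f"actual={actual:>7}  predicted={predicted:>7}  L1 loss={loss:>6}")
```

Each row gets its own loss value, and the sign of the error is discarded, so over-predicting and under-predicting by the same amount are penalised equally.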

What is the L1 cost function?

The most common cost function to use in conjunction with the L1 loss function is MAE (Mean Absolute Error) which is the mean of all the L1 values.

What’s the difference between L1 loss and MAE cost function?

There can often be confusion around the difference between loss functions and cost functions, so let me explain further.

The L1 loss is an error calculation for each example: it tells us how well we predicted that single observation. To understand the error across the whole dataset, we combine all the L1 loss values into a cost function called Mean Absolute Error (MAE) which, as the name suggests, is the mean of all the L1 loss values.

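The formula for MAE is therefore:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$$

where $n$ is the number of examples, $y_i$ is the actual value and $\hat{y}_i$ is the prediction for example $i$.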

Calculate L1 loss and MAE cost function in Python

L1 loss is the absolute difference between the actual and the predicted values, and MAE is the mean of those differences, so both are simple to implement in Python. I can show this with an example:

Calculate L1 loss and MAE cost using NumPy

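import numpy as np

# Actual target values and the model's predictions
actual = np.array([10, 11, 12, 13])
prediction = np.array([10, 12, 14, 11])

# L1 loss: element-wise absolute difference, one value per example
l1_loss = np.abs(actual - prediction)
print(l1_loss)
# Output: [0 1 2 2]

# MAE cost: mean of the per-example L1 losses
mae_cost = l1_loss.mean()
print(mae_cost)
# Output: 1.25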
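If you need this calculation in more than one place, it can be wrapped in small helper functions. The sketch below is one possible way to do that, not a library API: the function names `l1_loss` and `mae_cost` are my own, and the shape check is an assumption I have added because NumPy broadcasting would otherwise accept some mismatched inputs without raising an error.

```python
import numpy as np


def l1_loss(actual, prediction):
    """Per-example absolute error (hypothetical helper, not a library API)."""
    actual = np.asarray(actual, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    # Guard against silent broadcasting of mismatched inputs.
    if actual.shape != prediction.shape:
        raise ValueError("actual and prediction must have the same shape")
    return np.abs(actual - prediction)


def mae_cost(actual, prediction):
    """Mean of the per-example L1 losses (hypothetical helper)."""
    return float(l1_loss(actual, prediction).mean())


losses = l1_loss([10, 11, 12, 13], [10, 12, 14, 11])
cost = mae_cost([10, 11, 12, 13], [10, 12, 14, 11])
print(losses)
print(cost)
```

With the same inputs as the example above, `losses` holds the four per-example values and `cost` is their mean.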

Should I use L1 loss function?

There are several loss functions that can be used in machine learning, so how do you know if L1 is the right loss function for your use case? Well, that depends on what you are seeking to achieve with your model and what is important to you, but there tends to be one decisive factor:

L1 loss is less sensitive to outliers than L2 loss, because it grows linearly with the size of the error rather than with its square. So if you want to penalise large errors and outliers heavily, L1 is not a great choice and you should probably use L2 loss instead. However, if you don't want infrequent large errors to dominate the cost, then L1 is most likely a good choice.
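A quick way to see this difference is to compare the two losses on data containing a single large error. The sketch below uses made-up numbers, with the last prediction deliberately far off, and takes L2 loss to be the squared error, with MSE (mean squared error) as its cost:

```python
import numpy as np

# Made-up values for illustration; the last prediction is a deliberate outlier.
actual = np.array([10.0, 11.0, 12.0, 13.0])
prediction = np.array([10.0, 12.0, 14.0, 33.0])

errors = actual - prediction

l1_loss = np.abs(errors)  # grows linearly with the error
l2_loss = errors ** 2     # grows with the square of the error

mae = l1_loss.mean()
mse = l2_loss.mean()

# Share of the total cost contributed by the outlier under each loss
l1_outlier_share = l1_loss[-1] / l1_loss.sum()
l2_outlier_share = l2_loss[-1] / l2_loss.sum()

print(f"MAE: {mae}, outlier share: {l1_outlier_share:.2f}")
print(f"MSE: {mse}, outlier share: {l2_outlier_share:.2f}")
```

The outlier accounts for a larger share of the total under L2 than under L1, which is why a model trained with L2 loss will work harder to reduce that one large error.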
