L2 loss is a loss function commonly used in machine learning. In this post I will explain what it is, how to implement it in Python, and some common questions that users have.
L2 loss function, what is it?
L2 loss, also known as Squared Error Loss, is the squared difference between a prediction and the actual value, calculated for each example in a dataset. Aggregating these per-example loss values gives the cost function, and for L2 loss the usual cost function is MSE (Mean of Squared Errors).
L2 loss function formula
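The mathematical formula for calculating L2 loss for a single example, where $y$ is the actual value and $\hat{y}$ is the predicted value, is:

$$L_2(y, \hat{y}) = (y - \hat{y})^2$$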
L2 loss function example
Let’s say we are predicting house prices with a regression model. We could calculate the L2 loss per training example and the result would look like this:
| Actual value | Predicted value | Difference | L2 loss |
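To make the columns concrete, here is a minimal sketch that builds such a table for a handful of made-up house prices. The prices are illustrative assumptions rather than real data, and the names `actual_prices`, `predicted_prices` and `rows` are hypothetical.

```python
# Hypothetical house prices, chosen only for illustration (not real data).
actual_prices = [300_000, 250_000, 410_000]
predicted_prices = [310_000, 245_000, 400_000]

rows = []
for actual, predicted in zip(actual_prices, predicted_prices):
    difference = actual - predicted   # signed error for this example
    l2_loss = difference ** 2         # L2 loss: the squared difference
    rows.append((actual, predicted, difference, l2_loss))

# Print the rows in the same column order as the table header.
print("| Actual value | Predicted value | Difference | L2 loss |")
for actual, predicted, difference, l2_loss in rows:
    print(f"| {actual} | {predicted} | {difference} | {l2_loss} |")
```

Note that the sign of the difference disappears once it is squared, so over-predicting and under-predicting by the same amount give the same L2 loss.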
Is L2 loss the same as MSE (Mean of Squared Errors)?
L2 loss and MSE are related, but not the same. L2 loss is the loss for each example, whilst MSE is the cost function which is an aggregation of all the loss values in the dataset.
Let me explain further.
The L2 loss is an error calculation for a single example: it tells us how well we predicted that one observation. But what if we want to understand the error for the whole dataset? To do this we combine all the L2 loss values into a cost function called Mean of Squared Errors (MSE) which, as the name suggests, is the mean of all the L2 loss values.
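The formula for MSE over $n$ examples, where $y_i$ is the actual value and $\hat{y}_i$ is the predicted value for example $i$, is therefore:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$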
Calculate L2 loss and MSE cost function in Python
L2 loss is the squared difference between the actual and the predicted values, and MSE is the mean of all these values, and thus both are simple to implement in Python. I can show this with an example:
Calculate L2 loss and MSE cost using NumPy
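    import numpy as np

    actual = np.array([10, 11, 12, 13])
    prediction = np.array([10, 12, 14, 11])

    # L2 loss per example: the squared difference between actual and predicted
    l2_loss = (actual - prediction) ** 2
    print(l2_loss)
    # Output: [0 1 4 4]

    # MSE cost: the mean of all the per-example L2 loss values
    mse_cost = l2_loss.mean()
    print(mse_cost)
    # Output: 2.25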
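If NumPy is not available, the same calculation can be sketched in plain Python. This is only an illustrative version: the helper names `l2_loss` and `mse_cost` are my own, not part of any library, and the input values are the same four examples used above.

```python
def l2_loss(actual, predicted):
    """Return the L2 loss (squared difference) for a single example."""
    return (actual - predicted) ** 2


def mse_cost(actuals, predictions):
    """Return the mean of the per-example L2 losses."""
    if len(actuals) != len(predictions):
        raise ValueError("actuals and predictions must have the same length")
    if not actuals:
        raise ValueError("at least one example is required")
    losses = [l2_loss(a, p) for a, p in zip(actuals, predictions)]
    return sum(losses) / len(losses)


actual = [10, 11, 12, 13]
prediction = [10, 12, 14, 11]

losses = [l2_loss(a, p) for a, p in zip(actual, prediction)]
print(losses)                        # [0, 1, 4, 4]
print(mse_cost(actual, prediction))  # 2.25
```

Keeping the per-example loss and the aggregated cost in separate functions mirrors the distinction between L2 loss and MSE described earlier.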
Should I use the L2 loss function?
There are several loss functions that can be used in machine learning, so how do you know if L2 is the right loss function for your use case? Well, that depends on what you are seeking to achieve with your model and what is important to you, but there tends to be one decisive factor:
L2 loss is very sensitive to outliers because it squares the difference, so if you want to penalise large errors and outliers then L2 is a great choice. However, if you don't want to punish infrequent large errors, then L2 is most likely not a good choice and you should probably use L1 loss instead.
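The effect of a single outlier can be seen in a small sketch. It compares the mean L2 loss (MSE) with the mean L1 loss, taking L1 loss to be the absolute difference between actual and predicted values. The function names and the prediction values are assumptions made up for illustration.

```python
def mean_l1(actuals, predictions):
    """Mean of absolute differences (mean L1 loss)."""
    return sum(abs(a - p) for a, p in zip(actuals, predictions)) / len(actuals)


def mean_l2(actuals, predictions):
    """Mean of squared differences (mean L2 loss, i.e. MSE)."""
    return sum((a - p) ** 2 for a, p in zip(actuals, predictions)) / len(actuals)


actual = [10, 11, 12, 13]
close_predictions = [10, 12, 14, 11]    # every error is small
outlier_predictions = [10, 12, 14, 33]  # last prediction is off by 20

print(mean_l1(actual, close_predictions))    # 1.25
print(mean_l2(actual, close_predictions))    # 2.25
print(mean_l1(actual, outlier_predictions))  # 5.75
print(mean_l2(actual, outlier_predictions))  # 101.25
```

With one large error, the mean L1 loss grows from 1.25 to 5.75, while the mean L2 loss jumps from 2.25 to 101.25. A model trained to minimise L2 loss will therefore work much harder to reduce that single large error.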