A.I, Data and Software Engineering

# Advanced Keras – Custom loss functions


When working on machine learning problems, you sometimes want to construct your own custom loss function(s). This article introduces the abstract Keras backend for that purpose.

### Keras loss functions

According to the Keras loss documentation, there are several built-in loss functions, e.g. mean_absolute_percentage_error, cosine_proximity, kullback_leibler_divergence, etc. When compiling a Keras model, we often pass two parameters, optimizer and loss, as strings. The documentation describes the loss parameter as follows:

loss: String (name of objective function) or objective function or Loss instance. Note that if the model has multiple outputs, you can use a different loss on each output by passing a dictionary or a list of losses. The loss value that will be minimized by the model will then be the sum of all individual losses.
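As a minimal sketch of the string form, the snippet below compiles a tiny placeholder model with a built-in optimizer and loss given by name. The layer sizes and input shape are arbitrary assumptions chosen only to make the example runnable.

```python
from tensorflow import keras

# A tiny placeholder model; the shapes are arbitrary.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1),
])

# Both the optimizer and the loss are passed as strings.
model.compile(optimizer="adam", loss="mean_absolute_percentage_error")
```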

Next, we will discover step by step how to create and use a custom loss function. Later, we apply one loss function to predict fuel efficiency (miles per gallon, MPG) from the Auto MPG dataset.

### A Simple custom loss function

To keep our very first custom loss function simple, I will use the original “mean square error”; later we will modify it.

$${MSE}=\frac{1}{n}\sum_{i=1}^n(Y_i-\hat{Y_i})^2$$

Now for the tricky part: Keras loss functions must take only (y_true, y_pred) as parameters. So we need a separate function that returns another function, much like a Python decorator factory. Here, the function my_mse_loss() returns an inner function, mse(y_true, y_pred).
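A minimal sketch of such a factory, written with Keras backend operations, could look like this. The names my_mse_loss and mse come from the text; the use of axis=-1 is an assumption that mirrors how the built-in MSE reduces over the last axis.

```python
import tensorflow as tf
from tensorflow import keras

K = keras.backend  # the abstract Keras backend

def my_mse_loss():
    """Factory that returns a loss with the (y_true, y_pred) signature."""
    def mse(y_true, y_pred):
        # Mean of the squared differences over the last axis.
        return K.mean(K.square(y_pred - y_true), axis=-1)
    return mse
```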

That is it! Now we can use it while compiling our model.
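For instance, compiling could be sketched as follows. The factory is repeated so the snippet runs on its own, and the model itself is an arbitrary placeholder. Note that the factory is called, so compile() receives the inner mse function.

```python
from tensorflow import keras

K = keras.backend

def my_mse_loss():
    def mse(y_true, y_pred):
        return K.mean(K.square(y_pred - y_true), axis=-1)
    return mse

# Placeholder model with three input features (arbitrary choice).
model = keras.Sequential([
    keras.Input(shape=(3,)),
    keras.layers.Dense(1),
])

# my_mse_loss() is called here; its return value is the actual loss.
model.compile(optimizer="adam", loss=my_mse_loss())
```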

### A custom loss function with parameters

If you want the loss function to take other parameters, you can pass them to the factory.
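As a hedged illustration, the sketch below adds a hypothetical scale parameter to the factory. The name my_scaled_mse_loss and the parameter are assumptions made for this example; the inner function still takes only (y_true, y_pred) and reads scale from the enclosing scope.

```python
import tensorflow as tf
from tensorflow import keras

K = keras.backend

def my_scaled_mse_loss(scale=1.0):
    """Factory with an extra (hypothetical) parameter captured by closure."""
    def mse(y_true, y_pred):
        # The inner function sees `scale` from the enclosing scope.
        return scale * K.mean(K.square(y_pred - y_true), axis=-1)
    return mse

# Usage: model.compile(optimizer="adam", loss=my_scaled_mse_loss(scale=0.5))
```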

Important note: Even though Keras and TensorFlow accept NumPy arrays, it is highly recommended to keep everything within one library's "kingdom". Specifically, we should use the equivalent data types and operations provided by the current library. Try not to mix types!

The following code is NOT recommended!
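As a hypothetical illustration of the kind of mixing the note warns against, the sketch below contrasts a loss that applies NumPy functions to tensors with one that stays on backend operations. The names mixed_mse_loss and backend_mse_loss are made up for this example.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

K = keras.backend

def mixed_mse_loss():
    # NOT recommended: NumPy functions applied to tensors. This can fail
    # once Keras traces the loss into a graph, because symbolic tensors
    # cannot be converted to NumPy arrays.
    def mse(y_true, y_pred):
        return np.mean(np.square(y_pred - y_true), axis=-1)
    return mse

def backend_mse_loss():
    # Preferred: backend operations only, so the result stays a tensor.
    def mse(y_true, y_pred):
        return K.mean(K.square(y_pred - y_true), axis=-1)
    return mse
```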

### More than one loss function in one model

Sometimes, we may need to handle more than one output of our model. Consider the following example:

In the graph, A and B layers share weights. Some models may have only one input layer as the root of the two branches.

• loss1 will affect A, B, and C.
• loss2 will affect A, B, and D.

You can read this paper, in which two loss functions are used for graph embedding, or this article on multi-label classification. We can generalize the implementation into the following steps:

1. Create a model with n outputs
2. Create n loss functions
3. Pass n loss functions while compiling the model as a list or a dictionary.

Example code:
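A minimal sketch of the three steps, assuming the single-input variant in which one shared layer is the root of two branches. The layer sizes, the layer names, and the second loss my_mae_loss are assumptions made for illustration.

```python
import tensorflow as tf
from tensorflow import keras

K = keras.backend

def my_mse_loss():
    def mse(y_true, y_pred):
        return K.mean(K.square(y_pred - y_true), axis=-1)
    return mse

def my_mae_loss():  # hypothetical second loss for the second branch
    def mae(y_true, y_pred):
        return K.mean(K.abs(y_pred - y_true), axis=-1)
    return mae

# 1. A model with two outputs that share one hidden layer.
inputs = keras.Input(shape=(4,))
shared = keras.layers.Dense(8, activation="relu")(inputs)
out1 = keras.layers.Dense(1, name="b1_output")(shared)
out2 = keras.layers.Dense(1, name="b2_output")(shared)
model = keras.Model(inputs=inputs, outputs=[out1, out2])

# 2. One loss per output, in the same order as the outputs.
losses = [my_mse_loss(), my_mae_loss()]

# 3. Pass the list when compiling.
model.compile(optimizer="adam", loss=losses)
```

During training, the gradients of both losses flow through the shared layer, while each output layer is updated only by its own loss.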

You can also pass a dictionary of losses, as long as you first name each output layer that a loss applies to. For example, we name the output of branch one b1_output and use that name as the key in the dictionary.
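The dictionary form could be sketched as follows. The name b1_output comes from the text; b2_output, the layer sizes, and the choice of a built-in loss for the second branch are assumptions.

```python
from tensorflow import keras

K = keras.backend

def my_mse_loss():
    def mse(y_true, y_pred):
        return K.mean(K.square(y_pred - y_true), axis=-1)
    return mse

inputs = keras.Input(shape=(4,))
shared = keras.layers.Dense(8, activation="relu")(inputs)
# The layer names are what the dictionary keys must match.
out1 = keras.layers.Dense(1, name="b1_output")(shared)
out2 = keras.layers.Dense(1, name="b2_output")(shared)
model = keras.Model(inputs=inputs, outputs=[out1, out2])

# Custom and built-in losses can be mixed in the same dictionary.
losses = {
    "b1_output": my_mse_loss(),
    "b2_output": "mean_absolute_error",
}
model.compile(optimizer="adam", loss=losses)
```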

Let's try it on the Auto MPG dataset.

### Enable TF2.0 and load data

The IPython notebook was created with Google Colab:

Import libraries

Download the dataset using keras.utils and load it into a pandas data frame. The last five rows look like this:

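| | MPG | Cylinders | Displacement | Horsepower | Weight | Acceleration | Model Year | Origin |
|---|---|---|---|---|---|---|---|---|
| 393 | 27.0 | 4 | 140.0 | 86.0 | 2790.0 | 15.6 | 82 | 1 |
| 394 | 44.0 | 4 | 97.0 | 52.0 | 2130.0 | 24.6 | 82 | 2 |
| 395 | 32.0 | 4 | 135.0 | 84.0 | 2295.0 | 11.6 | 82 | 1 |
| 396 | 28.0 | 4 | 120.0 | 79.0 | 2625.0 | 18.6 | 82 | 1 |
| 397 | 31.0 | 4 | 119.0 | 82.0 | 2720.0 | 19.4 | 82 | 1 |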
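A sketch of the loading step might look like the following. The download URL (the UCI repository copy of the file) and the parsing options are assumptions that are not stated in this article; the column names match the table above. The loader is wrapped in a function so that nothing is downloaded until it is called.

```python
import pandas as pd
from tensorflow import keras

# Assumed location of the raw data file (UCI repository).
DATA_URL = ("http://archive.ics.uci.edu/ml/machine-learning-databases/"
            "auto-mpg/auto-mpg.data")

COLUMN_NAMES = ["MPG", "Cylinders", "Displacement", "Horsepower",
                "Weight", "Acceleration", "Model Year", "Origin"]

def load_auto_mpg(url=DATA_URL):
    """Download (and cache) the file, then parse it into a data frame."""
    path = keras.utils.get_file("auto-mpg.data", url)
    return pd.read_csv(path, names=COLUMN_NAMES, na_values="?",
                       comment="\t", sep=" ", skipinitialspace=True)

# Usage (requires network access):
# dataset = load_auto_mpg()
# dataset.tail()
```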

### Clean, split, and normalize data

The "Origin" column is really categorical, not numeric. To avoid implying a numeric order between its values, we convert it to a one-hot encoding:

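| | MPG | Cylinders | Displacement | Horsepower | Weight | Acceleration | Model Year | USA | Europe | Japan |
|---|---|---|---|---|---|---|---|---|---|---|
| 393 | 27.0 | 4 | 140.0 | 86.0 | 2790.0 | 15.6 | 82 | 1.0 | 0.0 | 0.0 |
| 394 | 44.0 | 4 | 97.0 | 52.0 | 2130.0 | 24.6 | 82 | 0.0 | 1.0 | 0.0 |
| 395 | 32.0 | 4 | 135.0 | 84.0 | 2295.0 | 11.6 | 82 | 1.0 | 0.0 | 0.0 |
| 396 | 28.0 | 4 | 120.0 | 79.0 | 2625.0 | 18.6 | 82 | 1.0 | 0.0 | 0.0 |
| 397 | 31.0 | 4 | 119.0 | 82.0 | 2720.0 | 19.4 | 82 | 1.0 | 0.0 | 0.0 |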
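The conversion can be sketched on a few of the rows shown above. The coding 1 = USA and 2 = Europe is consistent with those rows; 3 = Japan is an assumption, since no such row appears here.

```python
import pandas as pd

# Three of the rows shown above, reduced to two columns for brevity.
dataset = pd.DataFrame(
    {"MPG": [27.0, 44.0, 32.0], "Origin": [1, 2, 1]},
    index=[393, 394, 395],
)

# Remove the categorical column and add one indicator column per value.
origin = dataset.pop("Origin")
dataset["USA"] = (origin == 1) * 1.0
dataset["Europe"] = (origin == 2) * 1.0
dataset["Japan"] = (origin == 3) * 1.0  # assumed coding for Japan
```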

Now split the dataset into a training set (80%) and a test set (20%) by setting frac=0.8. We will use the test set in the final evaluation of our model.
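A minimal sketch of the split, using a toy stand-in frame rather than the real data. The random_state value is an assumption added to make the split reproducible.

```python
import pandas as pd

# Toy stand-in with 10 rows; the values are placeholders, not real data.
dataset = pd.DataFrame({"MPG": [float(i) for i in range(10)]})

# Sample 80% of the rows for training; the remaining rows form the test set.
train_dataset = dataset.sample(frac=0.8, random_state=0)
test_dataset = dataset.drop(train_dataset.index)
```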

Let's visualize the data:

We separate the target value, or “label”, from the features. This label is the value that you will train the model to predict. It is good practice to normalize features that use different scales and ranges to make training easier.
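These two steps could be sketched as follows, using three columns of the rows shown earlier as a stand-in for the training set. The helper name norm and the z-score normalization (subtract the mean, divide by the standard deviation) are assumptions about how the normalization is done.

```python
import pandas as pd

# Stand-in training set built from the five rows shown earlier.
train_dataset = pd.DataFrame({
    "MPG": [27.0, 44.0, 32.0, 28.0, 31.0],
    "Horsepower": [86.0, 52.0, 84.0, 79.0, 82.0],
    "Weight": [2790.0, 2130.0, 2295.0, 2625.0, 2720.0],
})

# Separate the label from the features.
train_labels = train_dataset.pop("MPG")

# Per-feature statistics, computed on the training features only.
train_stats = train_dataset.describe().transpose()

def norm(x):
    # Z-score: subtract the training mean, divide by the training std.
    return (x - train_stats["mean"]) / train_stats["std"]

normed_train_data = norm(train_dataset)
```

The same training statistics should also be applied to the test set, so that both sets are scaled consistently.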

### Build a model with custom loss

We use one cost function that we created earlier, i.e. my_mse_loss.
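A sketch of such a model is shown below. The architecture (two hidden layers of 64 units), the RMSprop optimizer, and the helper name build_model are assumptions; only my_mse_loss comes from the earlier section. The nine input features correspond to the columns left after removing MPG from the one-hot encoded table.

```python
from tensorflow import keras

K = keras.backend

def my_mse_loss():
    def mse(y_true, y_pred):
        return K.mean(K.square(y_pred - y_true), axis=-1)
    return mse

def build_model(n_features):
    # Small regression network with a single numeric output (MPG).
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(1),
    ])
    model.compile(
        optimizer=keras.optimizers.RMSprop(0.001),
        loss=my_mse_loss(),  # the custom loss replaces a built-in name
        metrics=["mae"],
    )
    return model

model = build_model(n_features=9)
```

Training then proceeds as with any built-in loss, for example by calling model.fit on the normalized training features and the training labels.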

### Conclusion

A loss function (also called an objective function or optimization score function) is one of the two parameters required to compile a model. You can create custom loss functions for specific purposes alongside the built-in ones. In part 2, we will continue with multiple metric functions.

