The Counterintuitive Behaviour of Training Accuracy and Training Loss

A decrease in loss may not mean an increase in accuracy.

Intuitively, we expect training accuracy and training loss to always be inversely correlated.

After all, better predictions should lead to a lower training loss.

But that may not always be true.

In other words, you may see situations where the training loss decreases, yet the training accuracy remains unchanged (or even decreases).

But how could that happen?

Firstly, understand that during training, the network ONLY cares about optimizing the loss function.

It does not care about the accuracy at all.

Accuracy is an external metric we introduce to evaluate the model's performance.

Now, when estimating the accuracy, the only thing we consider is whether we got a sample right or not.

Think of accuracy on a specific sample as a discrete measure of performance. Either right or wrong. That's it.

In other words, accuracy does not care if the network predicts a dog with 0.6 probability or 0.99 probability. Both count as the same prediction (assuming the classification threshold is, say, 0.5).

However, the loss function is different.

It is a continuous measure of performance.

It considers how confident the model is about a prediction.

Thus, the loss function cares if the network predicted a dog with 0.6 probability or 0.99 probability.
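To make the contrast concrete, here is a minimal Python sketch. It assumes the true label is "dog", a 0.5 threshold, and cross-entropy as the loss (the post does not name a specific loss function). The helper names are made up for illustration.

```python
import math

THRESHOLD = 0.5  # assumed classification threshold


def is_correct(p_dog: float) -> bool:
    """Accuracy's view: the sample is either right or wrong (true label: dog)."""
    return p_dog > THRESHOLD


def cross_entropy(p_dog: float) -> float:
    """Loss's view: negative log of the probability given to the true class."""
    return -math.log(p_dog)


for p in (0.6, 0.99):
    print(f"p(dog)={p}: correct={is_correct(p)}, loss={cross_entropy(p):.4f}")
```

Both predictions count as correct, so accuracy cannot tell them apart. The loss, however, is roughly 0.51 for the 0.6 prediction and roughly 0.01 for the 0.99 prediction.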

This, at times, leads to the counterintuitive behavior of decreasing loss yet stable (or decreasing) accuracy.

If both the training loss and the accuracy are decreasing, it may mean that the model is:

  • Becoming more and more confident on its correct predictions, and at the same time...

  • Becoming less confident on its incorrect predictions (being less confidently wrong also lowers the loss).

  • But overall, making more mistakes than before.
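The scenario above can be reproduced with a toy example. This is only a sketch: the four probabilities per epoch are hypothetical numbers chosen by hand (each is the probability the model assigns to the sample's true class), and cross-entropy with a 0.5 threshold is assumed.

```python
import math


def accuracy(p_true, threshold=0.5):
    """Fraction of samples whose true-class probability clears the threshold."""
    return sum(p > threshold for p in p_true) / len(p_true)


def mean_log_loss(p_true):
    """Average cross-entropy over the samples."""
    return sum(-math.log(p) for p in p_true) / len(p_true)


# Hypothetical true-class probabilities for the same four samples.
earlier_epoch = [0.55, 0.55, 0.55, 0.05]  # 3 barely correct, 1 confidently wrong
later_epoch = [0.95, 0.95, 0.45, 0.40]    # 2 confidently correct, 2 mildly wrong

for name, probs in (("earlier", earlier_epoch), ("later", later_epoch)):
    print(f"{name}: accuracy={accuracy(probs):.2f}, loss={mean_log_loss(probs):.3f}")
```

In the later epoch the model gets one more sample wrong, so accuracy falls from 0.75 to 0.50. The loss still drops, because the confidently wrong prediction (0.05) has become only mildly wrong (0.40) and the correct ones have become far more confident.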

If the training loss is decreasing, but the accuracy is stable, it may mean that the model is:

  • Becoming more confident in the right direction on its predictions. Given more training, the accuracy should improve.

  • But currently, not shifting the predicted probabilities far enough to push any sample across the probability threshold.
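A similar toy example shows the stable-accuracy case. Again, the probabilities are hypothetical values picked for illustration (probability assigned to each sample's true class), with cross-entropy and a 0.5 threshold assumed.

```python
import math


def accuracy(p_true, threshold=0.5):
    """Fraction of samples whose true-class probability clears the threshold."""
    return sum(p > threshold for p in p_true) / len(p_true)


def mean_log_loss(p_true):
    """Average cross-entropy over the samples."""
    return sum(-math.log(p) for p in p_true) / len(p_true)


# Hypothetical true-class probabilities for the same four samples.
before = [0.55, 0.60, 0.30, 0.40]
after = [0.70, 0.80, 0.40, 0.48]  # every probability improved, none crossed 0.5

for name, probs in (("before", before), ("after", after)):
    print(f"{name}: accuracy={accuracy(probs):.2f}, loss={mean_log_loss(probs):.3f}")
```

Every probability moves toward the true class, so the loss goes down. The two misclassified samples stay below 0.5 (0.40 and 0.48), so accuracy stays at 0.50.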

Having said that, remember that these kinds of fluctuations are quite normal, and you are likely to experience them.

The point is that while training accuracy and loss usually appear negatively correlated, you may still come across such counterintuitive situations.

Over to you: What do you think could be some other possible reasons for this behavior?

Post your answer in the replies.
