You Will Never Forget Precision and Recall If You Use the Mindset Technique

A simple and intuitive guide to precision and recall.

I have seen many folks struggling to intuitively understand Precision and Recall.

Despite being fairly straightforward, these metrics often intimidate people.

Yet, adopting the “mindset technique” can be incredibly helpful.

Let me walk you through it today.

For simplicity, we’ll refer to the "Positive class" as our class of interest.

Precision

Formally, Precision answers the following question:

“What proportion of positive predictions were actually positive?”

Let’s understand that from a mindset perspective.

When we are in a Precision Mindset, we don’t care about getting every positive sample classified correctly.

But every positive prediction the model does make should actually be positive.

For instance, consider a book recommendation system where a positive prediction means we like the recommended book.

In a Precision Mindset, we are okay if the model does not recommend all good books in the world.

But what it recommends should be good.

So even if this system recommended only one book and we liked it, this gives a Precision of 100%.

This is because what it classified as “Positive” was indeed “Positive.”

To summarize, in a high-precision mindset, all positive predictions should actually be positive.
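To make this concrete, here is a minimal Python sketch (the toy labels below are made up for illustration) that computes Precision = TP / (TP + FP) by hand and checks it against scikit-learn’s precision_score:

```python
from sklearn.metrics import precision_score

# Hypothetical labels: 1 = a book we like (positive), 0 = one we don't.
y_true = [1, 0, 1, 1, 0, 0, 1]  # ground truth
y_pred = [1, 0, 0, 1, 1, 0, 0]  # model's predictions

# Precision = TP / (TP + FP):
# of everything predicted positive, how much was truly positive?
tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))

print(tp / (tp + fp))                   # 0.67 -> 2 of 3 positive predictions were correct
print(precision_score(y_true, y_pred))  # same value via scikit-learn
```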

Recall

Recall is a bit different. It answers the following question:

“What proportion of actual positives was identified correctly by the model?”

When we are in a Recall Mindset, we care about getting each and every positive sample correctly classified.

It’s okay if some positive predictions were not actually positive.

However, all positive samples should be classified as positive.

For instance, consider a system that shortlists candidates for interviews based on their resumes. A positive prediction means that a candidate should be invited for an interview.

In a Recall Mindset, we are okay if the model selects some incompetent candidates.

But it should not miss out on inviting any skilled candidate.

So even if this system says that all candidates (good or bad) are fit for an interview, it gives us a Recall of 100%.

This is because it didn’t miss out on any of the positive samples.
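Here is the matching sketch for Recall = TP / (TP + FN), again on made-up labels, verified against scikit-learn’s recall_score:

```python
from sklearn.metrics import recall_score

# Hypothetical labels: 1 = a skilled candidate (positive), 0 = not.
y_true = [1, 0, 1, 1, 0, 0, 1]
y_pred = [1, 1, 1, 1, 1, 0, 0]  # model invites almost everyone

# Recall = TP / (TP + FN):
# of all actual positives, how many did the model find?
tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

print(tp / (tp + fn))                # 0.75 -> 3 of 4 skilled candidates were invited
print(recall_score(y_true, y_pred))  # same value via scikit-learn
```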

Which metric to choose entirely depends on what’s important to the problem at hand:

Optimize Precision if:

  1. We care about getting ONLY high-quality positive predictions.

  2. We are okay if some quality (or positive) samples are left out.

Optimize Recall if:

  1. We care about getting ALL quality (or positive) samples correct.

  2. We are okay if some non-quality (or negative) samples also come along.
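One practical way to act on this choice is to tune the classifier’s decision threshold: raising it typically trades recall for precision, and lowering it does the opposite. Here’s a minimal sketch on synthetic data (the dataset and threshold values below are purely illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic binary data, purely for illustration.
X, y = make_classification(n_samples=1000, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = LogisticRegression().fit(X_train, y_train)
proba = model.predict_proba(X_test)[:, 1]  # P(positive) per sample

# A higher threshold makes the model pickier: precision tends to rise, recall to fall.
for threshold in (0.3, 0.5, 0.8):
    y_pred = (proba >= threshold).astype(int)
    print(f"threshold={threshold}: "
          f"precision={precision_score(y_test, y_pred):.2f}, "
          f"recall={recall_score(y_test, y_pred):.2f}")
```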

But practically speaking, solely optimizing for high recall is not that useful.

This is because even a trivial function that always predicts “positive” will have a 100% Recall: by construction, such a naive model classifies every positive instance correctly.
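Here is that naive always-positive “model” on the toy labels from above: Recall is a perfect 1.0, while Precision collapses to whatever fraction of the data happens to be positive.

```python
from sklearn.metrics import precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 0, 1]
y_pred = [1] * len(y_true)  # naive model: always predict "positive"

print(recall_score(y_true, y_pred))     # 1.0  -> no positive sample is missed
print(precision_score(y_true, y_pred))  # 0.57 -> just the base rate of positives
```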

However, optimizing for high Precision does require some engineering effort.

I hope that was helpful :)

👉 Over to you: What analogy did you first use to understand Precision and Recall?

👉 If you liked this post, don’t forget to leave a like ❤️. It helps more people discover this newsletter on Substack and tells me that you appreciate reading these daily insights.

The button is located towards the bottom of this email.

Thanks for reading!


To receive all full articles and support the Daily Dose of Data Science, consider subscribing:

👉 Tell the world what makes this newsletter special for you by leaving a review here :)

👉 If you love reading this newsletter, feel free to share it with friends!
