Daily Dose of Data Science

You Were (Most Probably) Given Incomplete Info About How Python Dictionaries Work

Understanding the lesser-discussed internal workings of a Python dictionary.

January 09, 2024 • Reading Time: 8 minutes

While Python is probably one of the easiest programming languages to begin with, there are MANY things that make it quite weird and counterintuitive at times.

Today, I want to share one peculiarity about Python dictionaries, which most Python programmers aren’t aware of.

Let’s begin!

Suppose we declare a Python dictionary and add four keys during the declaration:

Integer type 1
Float type 1.0
Boolean type True
String type ‘1’
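
A minimal sketch of such a declaration. The string values are placeholders chosen for illustration (an assumption); only the four keys come from the list above:

```python
# Keys are the four listed above; the values are hypothetical placeholders.
my_dict = {
    1: "integer",
    1.0: "float",
    True: "boolean",
    "1": "string",
}
```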

These four keys are four different objects, which we can verify by comparing their object IDs using the built-in id() function.
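
One way to check this is sketched below. The exact ID values differ from run to run, so only their distinctness matters:

```python
# Hold all four objects in a list so they are alive at the same time.
keys = [1, 1.0, True, "1"]
ids = [id(k) for k in keys]

for k, i in zip(keys, ids):
    print(f"{k!r:>5} ({type(k).__name__}): id={i}")

# Four distinct objects give four distinct IDs.
print(len(set(ids)))  # 4
```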

Thus, one might expect that the final dictionary must have four keys.

However, when we print this dictionary, we notice that there are only two keys: the integer 1 and the string ‘1’.
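
A quick way to see this for yourself, using placeholder values (an assumption, chosen only so the surviving value is easy to spot):

```python
# Four keys are written, but three of them collapse into one entry.
my_dict = {1: "integer", 1.0: "float", True: "boolean", "1": "string"}

print(my_dict)       # {1: 'boolean', '1': 'string'}
print(len(my_dict))  # 2
```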

Where did the other two keys go?

What are we missing here?

Internal workings of a Python dictionary

Many Python programmers assume that dictionaries find/insert/delete a key based on the ID of the object being added as a key.

But this is not true.

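Instead, dictionaries do this using the key's hash value, computed with the built-in hash() function, together with an equality check (==).

Thus, if two objects have the same hash value and also compare equal, a dictionary treats them as the same key. A matching hash alone is not enough: in CPython, hash(-1) and hash(-2) are both -2, yet -1 and -2 remain separate keys because they are not equal.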

This is precisely what we notice with ‘Integer 1’, ‘Float 1.0’, and ‘Boolean True’ objects.

These three objects have the same hash value, which is 1.
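
A short check of those hash values. The string ‘1’ is left out because string hashes are randomized per process by default, so its value varies between runs:

```python
# int 1, float 1.0 and bool True all hash to the same value.
hashes = {repr(key): hash(key) for key in (1, 1.0, True)}
print(hashes)  # {'1': 1, '1.0': 1, 'True': 1}

# The three objects also compare equal to one another.
all_equal = (1 == 1.0) and (1.0 == True)
print(all_equal)  # True
```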

Since they also compare equal (1 == 1.0 == True), a Python dictionary treats them as the same key, even though they are entirely different objects.
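
One consequence, sketched with placeholder values (an assumption): any of the three objects can be used to look up the same entry.

```python
my_dict = {1: "integer", 1.0: "float", True: "boolean", "1": "string"}

# All three lookups hit the single entry stored under the key 1.
print(my_dict[1], my_dict[1.0], my_dict[True])  # boolean boolean boolean
print(1.0 in my_dict, True in my_dict)          # True True

# The string '1' is unrelated and keeps its own entry.
print(my_dict["1"])  # string
```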

One thing to notice here is that the final dictionary keeps the ‘Integer 1’ key, but its value is the one given for the ‘Boolean True’ key.

This happens because of the order in which we specified the keys during the dictionary declaration: 1, then 1.0, then True, then ‘1’.

Let’s break down the steps:

When the dictionary object is created, the dictionary is empty. From here on, the keys will be added one by one.

First, ‘Integer 1’ is added, so the dictionary holds a single key, 1, with its own value.

Next, while adding ‘Float 1.0’, Python finds that it has the same hash as the existing key ‘Integer 1’ and compares equal to it. Thus, the existing key (‘Integer 1’) is kept, but its value is updated to the one given for ‘Float 1.0’.

Moving on, while adding ‘Boolean True’, the same match occurs with the existing key ‘Integer 1’. Yet again, the existing key (‘Integer 1’) is kept, but its value is updated.

Finally, while adding the string ‘1’, no existing key matches, so a new key is added. The dictionary ends up with two keys: 1 (holding the value given for ‘Boolean True’) and ‘1’.
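
The steps above can be replayed one insertion at a time. This is a sketch with placeholder values (an assumption); plain item assignment stands in for the dictionary literal:

```python
my_dict = {}
steps = [(1, "integer"), (1.0, "float"), (True, "boolean"), ("1", "string")]

snapshots = []
for key, value in steps:
    # An equal key with the same hash keeps the old key object
    # and only overwrites the value.
    my_dict[key] = value
    snapshots.append(dict(my_dict))
    print(f"after adding {key!r}: {my_dict}")

# after adding 1: {1: 'integer'}
# after adding 1.0: {1: 'float'}
# after adding True: {1: 'boolean'}
# after adding '1': {1: 'boolean', '1': 'string'}
```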

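We can validate this reasoning by changing the order in which we specified the keys during the dictionary declaration.

This time, we get a different output, but it is consistent with the reasoning above: whichever of 1, 1.0, and True is listed first remains as the key, and its value comes from whichever of them is listed last.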
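
As an illustration, here is one possible reordering with True listed first. Both the order and the values are assumptions, not the ones from the original listing:

```python
# True is inserted first, so it is the key object that survives.
reordered = {True: "boolean", 1: "integer", 1.0: "float", "1": "string"}

print(reordered)  # {True: 'float', '1': 'string'}
```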

Takeaway

Contrary to common belief, Python dictionaries do not rely on object identity to find/insert/delete keys. Instead, they treat two keys as the same when their hash values match and the keys compare equal.
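
The same behavior can be reproduced with your own types. Below is a sketch using a hypothetical class `Label` (not from this post) whose `__hash__` and `__eq__` depend only on its name:

```python
class Label:
    """Hypothetical key type: hash and equality depend only on `name`."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, Label) and self.name == other.name

    def __hash__(self):
        return hash(self.name)


a = Label("x")
b = Label("x")
print(a is b)  # False: two distinct objects

d = {a: "first"}
d[b] = "second"  # same hash and equal to `a`, so the value is overwritten

print(len(d))  # 1
print(d[a])    # second
```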


👉 Over to you: What are some other Python technical concepts that most programmers misinterpret?


Thanks for reading!
