A Misconception About Pandas Inplace

The counterintuitive behaviour of inplace operations.

Most Pandas users have a misconception about inplace operations.

They use them heavily, expecting:

  • Smaller run-time

  • Lower memory usage

And, of course, the reasoning makes intuitive sense as well.

Inplace, as the name suggests, should modify the DataFrame without creating a new copy. Thus, it is reasonable to expect that inplace will be more efficient.

Yet, this is rarely the case, which is also evident from the image below:

It is clear that in most cases, inplace operations are slower than their non-inplace counterparts.
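If you want to check this on your own machine, a rough timing sketch along these lines should do. The data size, the choice of fillna(), and the function names are my assumptions for illustration, not the exact setup behind the results above, and the numbers will vary with the Pandas version and the operation.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data: 200,000 rows, two float columns, some NaNs in "a".
rng = np.random.default_rng(0)
base = pd.DataFrame(rng.random((200_000, 2)), columns=["a", "b"])
base.iloc[::10, 0] = np.nan


def fill_inplace():
    """Fill NaNs with inplace=True on a fresh copy of the data."""
    df = base.copy()
    df.fillna(0.0, inplace=True)
    return df


def fill_reassign():
    """Fill NaNs by reassigning the returned DataFrame."""
    df = base.copy()
    df = df.fillna(0.0)
    return df


# Both timings include the cost of base.copy(), so the comparison is fair.
t_inplace = timeit.timeit(fill_inplace, number=10)
t_reassign = timeit.timeit(fill_reassign, number=10)

print(f"inplace=True : {t_inplace:.4f} s")
print(f"reassignment : {t_reassign:.4f} s")
```

Both functions produce the same DataFrame, so any difference in the printed timings comes from how the operation is applied, not from what it computes.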

Why does this happen?

Contrary to common belief, most Pandas inplace operations do not prevent the creation of a new copy.

They typically build the result as a new copy and then assign it back to the original object.

But during this assignment step, Pandas has to perform some additional checks (SettingWithCopy, for instance) to ensure that the DataFrame is being modified correctly.

This, at times, can be an expensive operation.
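The visible side of this behaviour is easy to see in a small sketch. It does not expose the internal copy itself, only what you observe from the outside: the inplace call hands back None (in the Pandas versions I am aware of) and the original object carries the result, while the regular call returns a separate DataFrame. The column name and values are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [3, 1, 2]})
alias = df  # a second name bound to the same DataFrame object

# The inplace call returns None; the result lives in the original object.
ret = df.sort_values("a", inplace=True)
inplace_values = df["a"].tolist()

# The regular call leaves df untouched and returns a new DataFrame.
new_df = df.sort_values("a", ascending=False)

print(ret)             # None
print(alias is df)     # True: still the same object
print(inplace_values)  # [1, 2, 3]
print(new_df is df)    # False: a separate object
```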

So, in general, there is no guarantee that an inplace operation is faster, which the results above also confirm.

What’s more, one thing I particularly dislike about inplace operations is that they inhibit method chaining as depicted below:
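Here is a minimal sketch of the chaining problem. The DataFrame and its column names are hypothetical. Since an inplace call returns None (at least in the Pandas versions I am aware of), the next method in the chain has nothing to operate on.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame(
    {"city": ["Oslo", "Rome", None, "Lima"], "sales": [30, 10, 20, 40]}
)

# Without inplace: every step returns a DataFrame, so the steps chain.
result = (
    df.dropna()
    .sort_values("sales")
    .reset_index(drop=True)
)
print(result)

# With inplace=True: dropna() returns None, so the chain breaks.
error = None
try:
    df.copy().dropna(inplace=True).sort_values("sales")
except AttributeError as exc:
    error = str(exc)

print(error)
```

With inplace=True, each step has to sit on its own line, and the intermediate state of the DataFrame is harder to follow.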

As a result, I avoid inplace operations in Pandas.

👉Over to you: Despite this, are there still any situations where you prefer using inplace operations in Pandas?
