Why Is join() Faster Than Iteration?

A reminder to always prefer specific methods over a generalized approach.

There are two popular ways to concatenate multiple strings:

  1. Iterating and appending them to a single string.

  2. Using Python’s built-in join() method.
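For concreteness, here is a minimal sketch of the two approaches. The `words` list is a made-up example, and the separator is assumed to be a single space.

```python
words = ["prefer", "specific", "methods"]  # hypothetical input

# Approach 1: iterate and append each string plus a separator.
result_loop = ""
for word in words:
    result_loop += word + " "

# Approach 2: hand the whole list to str.join().
result_join = " ".join(words)

print(repr(result_loop))
print(repr(result_join))
```

Note that the loop as written leaves a trailing space, while join() only places the separator between items.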

But the 2nd approach is typically much faster than the 1st approach.
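If you want to measure this yourself, a rough benchmark could look like the sketch below. The function names, the list size, and the repeat count are arbitrary choices for illustration, and the timings will vary with your machine and Python version.

```python
import timeit


def concat_loop(words):
    """Concatenate by iterating and appending to a single string."""
    result = ""
    for word in words:
        result += word + " "
    return result


def concat_join(words):
    """Concatenate with str.join()."""
    return " ".join(words)


words = ["word"] * 10_000  # hypothetical input size

# Time 20 runs of each approach on the same list.
loop_time = timeit.timeit(lambda: concat_loop(words), number=20)
join_time = timeit.timeit(lambda: concat_join(words), number=20)

print(f"loop: {loop_time:.4f} s")
print(f"join: {join_time:.4f} s")
```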

Can you answer why?

The answer is not vectorization!

Continue reading to learn more.

When concatenating in a loop, Python simply executes each instruction as it comes across it.

Thus, it does not know (beforehand):

  • the number of strings it will concatenate

  • the number of white spaces it will need

Simply put, iteration leaves very little scope for optimization.

As a result, during every iteration, Python asks for a memory allocation of:

  • the string at the current iteration

  • the white space added as a separator

This leads to repeated memory allocation requests.

Roughly speaking, the number of requests in this case is two times the size of the list.
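The "two per item" figure can be seen by spelling out the loop and counting the concatenations it performs. This is a simplified model, not a measurement of the interpreter: the counter and the helper name are made up for illustration, and a real Python implementation may sometimes avoid a fresh allocation for an individual step.

```python
def count_concatenations(words):
    """Count the concatenations the loop performs (a simplified model)."""
    result = ""
    concatenations = 0
    for word in words:
        piece = word + " "       # 1st: the string plus its separator
        concatenations += 1
        result = result + piece  # 2nd: append to the running result
        concatenations += 1
    return result, concatenations


words = ["a", "b", "c", "d"]
result, n = count_concatenations(words)
print(n)  # 8, i.e. 2 * len(words)
```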

But this is not the case when we use join().

Because in that case, Python precisely knows (beforehand):

  • the number of strings it will be concatenating

  • the number of white spaces it will need

So the memory for the entire result can be requested in a single call, upfront, before any concatenation happens.
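The idea of "measure first, allocate once, then copy" can be sketched in plain Python. This is only an illustration of the strategy, not how join() is actually implemented: `join_sketch` is a hypothetical helper, and it works on UTF-8 bytes so that a preallocated `bytearray` can stand in for the single allocation.

```python
def join_sketch(separator, words):
    """Illustrative two-pass join: measure first, allocate once, then copy."""
    if not words:
        return ""
    sep = separator.encode("utf-8")
    encoded = [w.encode("utf-8") for w in words]

    # Pass 1: the total size is known before anything is copied.
    total = sum(len(b) for b in encoded) + len(sep) * (len(encoded) - 1)

    # A single allocation for the whole result.
    buffer = bytearray(total)

    # Pass 2: copy each piece into its slot in the buffer.
    pos = 0
    for i, b in enumerate(encoded):
        if i:  # the separator goes between items only
            buffer[pos:pos + len(sep)] = sep
            pos += len(sep)
        buffer[pos:pos + len(b)] = b
        pos += len(b)
    return buffer.decode("utf-8")


print(join_sketch(" ", ["one", "single", "allocation"]))
```

The sketch still allocates small temporary objects while encoding, so treat it as a picture of the strategy rather than a faster replacement for join().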

To summarize:

  • with iteration, the number of memory allocation calls is 2x the list's size.

  • with join(), the number of memory allocation calls is just one.

This explains the significant difference in run-time between the two approaches.

This post is also a reminder to ALWAYS prefer specific methods over a generalized approach.

These subtle sources of optimization can lead to profound improvements in run-time and memory utilization of your code.

👉 Over to you: What other ways do you commonly use to optimize native Python code?
