
I come from a C++/Java background and wrote a simple implementation of the 'TwoNumberSum' algorithm below. I first implemented it in a C/Java style with a conventional nested loop, and then again using the 'in' operator. I expected both to run in roughly the same time, since 'in' should also iterate through the list, effectively giving a nested loop as well, but to my surprise the first version takes about double the time of the second. Can someone explain what causes such a huge gap in runtime?

I get the results below when executing the snippet.

Execution Time of Algo 1: 1.023191

Execution Time of Algo 2: 0.46218059999999994

from timeit import timeit

# First Algorithm using simple lists/arrays
code1 = """
def TwoNumberSum(list, sum):

    for i in range(0, len(list)-1):
        for j in range(i+1, len(list)):
            if list[i] + list[j] == sum:
                return [list[i], list[j]]
    return []


TwoNumberSum([3, 5, -4, 8, 11, 1, -1, 6], 10)
"""
# Second Algorithm, similar to the first but using the 'in' operator
code2 = """
def TwoNumberSum(list, sum):

    for i in range(0, len(list)-1):
        if sum-list[i] in list[i+1:-1]:
            return [list[i], sum-list[i]]
    return []

TwoNumberSum([3, 5, -4, 8, 11, 1, -1, 6], 10)
"""

print(f"Execution Time of Algo 1: {timeit(code1, number=100000)}")
print(f"Execution Time of Algo 2: {timeit(code2, number=100000)}")
Manav Saxena
  • Numpy basically offers a gateway to C-style iteration. You might find the discussion at this question useful: [numpy: function for simultaneous max() and min()](https://stackoverflow.com/questions/12200580/numpy-function-for-simultaneous-max-and-min/58043564#58043564) – Nathan Chappell Feb 16 '20 at 09:23
  • Are you aware that Python itself isn't compiled, but most builtins are written in a different language? – MisterMiyagi Feb 16 '20 at 09:28

1 Answer


Python is an interpreted language, and for loops in it are notoriously slow. You probably expect that in is implemented as a for loop, and it is, but it's implemented natively (typically in C), not in Python. There are a lot of functions like that in Python, otherwise the whole language would be impractically slow (at least on the most common interpreter implementation, which is CPython).
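For instance, here is a quick sketch that times a hand-written membership test against the built-in `in` operator; the exact numbers will vary by machine, but the built-in check wins because its loop runs in C:

from timeit import timeit

data = list(range(1000))

def contains(lst, value):
    # Pure-Python loop: every iteration executes interpreted bytecode
    for item in lst:
        if item == value:
            return True
    return False

print(timeit(lambda: contains(data, -1), number=10000))  # interpreted loop
print(timeit(lambda: -1 in data, number=10000))          # the same scan, but in C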

If you care about performance in Python, you'll need to do everything you can to avoid Python loops that are executed many times. This is a big part of why NumPy exists and is written largely in C (because operating on vectors and arrays takes forever if you use for loops).
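As a minimal sketch of what that buys you (assuming NumPy is installed), the vectorized expression below does the same element-wise work as the list comprehension, but the loop runs in compiled code instead of the interpreter:

import numpy as np

a = np.arange(1_000_000)
b = np.arange(1_000_000)

# Python-level loop: one interpreted iteration per element
slow = [x + y for x, y in zip(a, b)]

# Vectorized: a single call, the loop runs in compiled C code
fast = a + b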

You can also write your own Python modules in C (or C++, or other languages which can expose C callable functions), and this is very useful if you need to accelerate custom loops.


Added to address comments from user "Heap Overflow":

If the input list is long, you want a linear-time solution. Here is one:

def TwoNumberSum(nums, target):
    seen = set()                    # values encountered so far
    for num in nums:
        complement = target - num
        if complement in seen:      # set lookup is O(1) on average
            return num, complement
        seen.add(num)
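For the example input from the question it returns the pair as soon as the second member is seen:

TwoNumberSum([3, 5, -4, 8, 11, 1, -1, 6], 10)  # (-1, 11)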

This is actually slower for the short example lists given in the question, but will be faster for longer lists. But it still contains a for loop in Python, which means it won't be blazing fast. An easy way to fix that is to use Numba, which automatically translates a subset of Python to machine code:

import numba
TwoNumbaSum = numba.njit(TwoNumberSum)

Numba expects NumPy arrays rather than lists, so call it like this:

import numpy as np

TwoNumbaSum(np.asarray([3, 5, -4, 8, 11, 1, -1, 6]), 10)

That's it: now you have a linear-time solution with no Python-level for loops, and you didn't have to write any C or C++ (which would run just as fast but take more effort to develop).
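One caveat if you benchmark it yourself: the first call triggers JIT compilation, so warm the function up before timing it. A quick sketch, reusing the names defined above:

import numpy as np
from timeit import timeit

arr = np.asarray([3, 5, -4, 8, 11, 1, -1, 6])
TwoNumbaSum(arr, 10)  # first call compiles the function
print(timeit(lambda: TwoNumbaSum(arr, 10), number=100000))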

John Zwinck
  • You can also see, using `dis.dis`, that fewer opcodes are generated as well... – Jon Clements Feb 16 '20 at 09:26
  • Is there a linear time NumPy solution for this problem as well? (I don't know much NumPy.) – Kelly Bundy Feb 16 '20 at 09:29
  • @HeapOverflow numpy does not provide linear time solutions to square complexity problems - it just offers vastly faster square time. – zvone Feb 16 '20 at 09:43
  • @zvone It's not a square complexity problem. – Kelly Bundy Feb 16 '20 at 14:54
  • @zvone I guess the silence and comment-upvotes mean NumPy really can't do it in linear time and really needs quadratic time? Then I'd say NumPy is bad advice here, as it's going to be slower than the linear time pure Python solution fairly quickly. I'd do benchmarks, but as I said I don't know much NumPy, and I don't know how to write a good quadratic time NumPy solution. – Kelly Bundy Feb 21 '20 at 19:11
  • @HeapOverflow I'm afraid I don't know what you are saying. The code from the question has quadratic complexity. Whether the same can be done in linear time does not depend on whether you use numpy or some other tool. – zvone Feb 21 '20 at 19:25
  • @zvone Yes, the code from the question is bad. A better *algorithm* (linear time) is what I would've suggested. John didn't, instead suggesting NumPy or using C. No mention of better algorithms. What he describes sounds like only a constant speedup. Unless NumPy also allows a linear time solution (in C you can of course do it, but I'd first do it in Python and see whether that suffices). Now you say it doesn't depend on whether I use NumPy or not, so you're saying it *can* be done with NumPy in linear time? How? Like I said, I don't know much NumPy, and I'm interested in seeing that. – Kelly Bundy Feb 21 '20 at 19:38
  • @HeapOverflow Well, if you see a solution in linear time, you should give your own answer with it. I myself did not spend much time on this question. I just commented on your comment, to correct a possible misunderstanding about what numpy does. – zvone Feb 21 '20 at 19:51
  • @zvone It wouldn't be an answer, as the question isn't how to make it faster but why there's that difference between the OP's two solutions. Also, it's old and boring, for example it's [LeetCode's oldest problem](https://leetcode.com/problemset/all/) and I'm not interested in repeating it. What I *am* interested in is to learn how to do it in NumPy, hence my question. – Kelly Bundy Feb 21 '20 at 19:58
  • @HeapOverflow: I never suggested using NumPy for this particular problem, I only mentioned it as one approach to solving the Python "for loops are slow" issue. I've added a linear time solution for this problem using Numba, please see. There is probably no linear time solution using NumPy because NumPy is all about arrays, not hash tables. – John Zwinck Feb 23 '20 at 06:27
  • @JohnZwinck Great! I couldn't find anything about hash tables in NumPy, which is why I suspected there might not be a linear solution. So thanks for your assessment of this and for showing how to use Numba (which I had heard of but knew nothing about). – Kelly Bundy Feb 23 '20 at 12:37