2

I participate in coding competitions and am curious about the comparative speeds of different loops - specifically, the for loop (for i in range(...) versus the for item loop (for item in myArray).

To test this, I wrote the code below:

import time

def ForLoop():
    for i in range(len(array)):
        item = array[i]

def ForItemLoop():
    for item in array:
        a = item

array = [i for i in range(100000)]

t0 = time.monotonic_ns()
ForLoop()
t1 = time.monotonic_ns()
print("For loop time: (nanoseconds) " + str(t1-t0))
t0 = time.monotonic_ns()
ForItemLoop()
t1 = time.monotonic_ns()
print("For item loop time (nanoseconds) " + str(t1-t0))

(I wrote the a = item in ForItemLoop() because writing pass might mess things up)

The output (on Jupyter Notebook) was:

For loop time: (nanoseconds) 22432125
For item loop time (nanoseconds) 3757917

Why is the for item loop so much faster? The a = item part seems to make it super redundant, yet it still is 5 times faster?

Ryan Fu
  • 349
  • 1
  • 5
  • 22
  • Use `timeit` for measuring performances, not a single run of a function. That will give you averages. – 9769953 Jul 27 '21 at 06:01
  • 2
    Oops, I didn't know about that. Still, the gap is so large (and I ran the code multiply times, For loop was always ~2x10^7, For Item was always ~3x10^6), and the disparity is so much it's probably not a matter of averages. – Ryan Fu Jul 27 '21 at 06:04
  • It will be faster indeed; timeit won't change that. Just for future notice. – 9769953 Jul 27 '21 at 06:04
  • I can't explain why it's faster, but is there a reason you may need the indexing part? The `for item in array` loop also reads better (subjective), and avoids a potential indexing problem, so if you can avoid the indexing for-loop, all the better. – 9769953 Jul 27 '21 at 06:06
  • Just look at the two loops. In `for i in range` you have: two function calls: `range` and `len` and inside the loop, each iteration you index the list `array[i]`. In the second loop you simply iterate over an iterator on the list... No function calls and no indexing – Tomerikoo Jul 27 '21 at 06:08
  • The indexing needs to be able to handle an IndexError, while the second loop does not have to do this. – 9769953 Jul 27 '21 at 06:09
  • Just from looking at the [disasm](https://stackoverflow.com/a/10930151/6045800) it's easy to see why... – Tomerikoo Jul 27 '21 at 06:12
  • Although random access to a list element is `O(1)` in Python, there is still a constant time cost associated with it. Update: It's the `BINARY_SUBSCR` instruction in the disassembly posted by @Tomerikoo – Selcuk Jul 27 '21 at 06:14
  • @9769953 What do you mean? If there is an `IndexError` it is being raised, the loop doesn't handle it by itself... – Tomerikoo Jul 27 '21 at 06:14
  • @Tomerikoo That is indeed the best answer, although it is for Python 2. Would be nice if there is an update for a recent Python version. – 9769953 Jul 27 '21 at 06:14
  • @Tomerikoo Sorry, I mean to it needs to check for an IndexError, not so much handle it. – 9769953 Jul 27 '21 at 06:15
  • @9769953 Again, it doesn't *"check"*. If there is an `IndexError` it is being raised. As long as there is no error, it doesn't make a difference... – Tomerikoo Jul 27 '21 at 06:16
  • 1
    @Tomerikoo It does need to check whether the index is out of bounds, right? You're accessing the array with an index, and an IndexError might be raised, so there should be some check to see that the access is within range. – 9769953 Jul 27 '21 at 06:18
  • Error handling is about [asking forgiveness, not permission](https://stackoverflow.com/questions/12265451/ask-forgiveness-not-permission-explain). You simply access. If it's out of bounds an error will raised to stop you. As long as everything is fine, you simply access the list items. No check is required. – Tomerikoo Jul 27 '21 at 06:20
  • 1
    @Tomerikoo For your first comment, I decided to change the len to just the length of the list, 100000, and it still had pretty much the same time. So is range() eating up that much time? – Ryan Fu Jul 27 '21 at 06:21
  • 1
    @Tomerikoo I disagree: if there is no underlying if-clause, there can never be an IndexError raised. The if-clause may be very fast, use branch prediction and other optimisations, but there still needs to be a check that `array[i]` is actual an element in the array, not outside of it. – 9769953 Jul 27 '21 at 06:21
  • @9769953 Oh I see you're talking about actual C-level. I was looking at it differently. But in that case, then in the `for item` loop there is a `StopIteration` error in the end which surely must be produced the same way... – Tomerikoo Jul 27 '21 at 06:26
  • @Tomerikoo That is probably the case; I'm not sure how that actually works with `range`, since start and end (and step) are known for that loop beforehand. – 9769953 Jul 27 '21 at 06:36
  • @RyanFu Regarding your last comment *"So is range() eating up that much time?"* - those functions are probably not the critical part. They are only called once before the loop starts. The big difference most likely comes from the indexing. As mentioned in other comments, although it is constant time, it still more time as opposed to the simple iterator – Tomerikoo Jul 27 '21 at 15:31

0 Answers0