42

If I only need the index inside a loop, is it better to use the range/xrange function in combination with len()

a = [1,2,3]
for i in xrange(len(a)):
    print i 

or enumerate? Even if I won't use p at all?

for i,p in enumerate(a):
    print i    
LarsVegas
  • 8
    I'd be really curious what your use case is. – Sven Marnach Aug 10 '12 at 11:53
  • I came across some code where `enumerate` shouldn't have been used in the first place: `[[profiel.attr[i].x for i,p in enumerate(profiel.attr)] for profiel in prof_obj]`. `p` isn't needed, or it should be `[[p.attr.x for p in profiel.attr] for profiel in prof_obj]`. So I asked myself whether I should rewrite the code one way or the other... – LarsVegas Aug 10 '12 at 12:18
  • This code should actually be `[[p.x for p in profiel.attr] for profiel in prof_obj]`. – Sven Marnach Aug 10 '12 at 12:28
  • True, my bad. Can't edit anymore, so thanks for straightening this out. – LarsVegas Aug 10 '12 at 12:32
  • @Sven Marnach, recently I did some coding where I actually only needed the index to access slices of arrays, like so: `sum_dist = [[sum(afst[:i]) for i,_ in enumerate(afst,start=1)] for afst in dist_betw]`. (Even though I know this construct isn't really needed, as I could also use `itertools.accumulate()`.) – LarsVegas Sep 06 '12 at 06:44
  • That's a really bad way to compute a cumulative sum. You sum up the first elements over and over again, resulting in quadratic complexity for something that is inherently linear. If you are using Python 3.2 anyway, `itertools.accumulate()` is the obvious way. If using NumPy is an option, you can also use `numpy.cumsum()`. In all other cases, simply roll your own O(n) `cumsum()` function. – Sven Marnach Sep 06 '12 at 09:43
  • I didn't know it was so bad. Actually I thought about defining a `cumsum` function in the first place, but then had this idea. Of course you're right about the quadratic complexity, but as I'm dealing with very small data sets it didn't really strike me as something that might burn me. Thanks for pointing it out to me though. – LarsVegas Sep 06 '12 at 12:29
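To make the complexity point in the comments above concrete, here is a minimal sketch in Python 3 comparing the slice-and-sum construct with a single-pass running total and with `itertools.accumulate()`. The sample list `afst` and the helper name `cumsum` are assumptions made up for illustration, not code from the thread.

```python
from itertools import accumulate


def cumsum(values):
    """Return the running totals of values in a single O(n) pass."""
    total = 0
    result = []
    for value in values:
        total += value
        result.append(total)
    return result


afst = [3, 1, 4, 1, 5]  # hypothetical sample distances

# Quadratic: every prefix is summed again from the start.
quadratic = [sum(afst[:i]) for i in range(1, len(afst) + 1)]

# Linear: each element is added exactly once.
linear = cumsum(afst)
builtin = list(accumulate(afst))

print(quadratic, linear, builtin)
```

All three produce the same list; only the amount of work differs, which starts to matter once the input is no longer tiny.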

8 Answers

30

I would use enumerate as it's more generic: it works on any iterable, not just on sequences, and the overhead of returning a reference to each object isn't a big deal. xrange(len(something)) states your intent more readably (to me), but it will break on objects that don't support len().
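A small sketch of that difference, written for Python 3 (so `range` rather than `xrange`). The generator `squares` is a hypothetical example of an iterable that has no length.

```python
def squares(n):
    """Hypothetical generator: yields values lazily and has no len()."""
    for k in range(n):
        yield k * k


# len() is not defined for generators, so range(len(...)) cannot be used.
try:
    len(squares(3))
    has_len = True
except TypeError:
    has_len = False

# enumerate() only needs the object to be iterable.
indices = [i for i, _value in enumerate(squares(3))]

print(has_len, indices)
```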

Jon Clements
  • Very interesting point. Which would be an example of an object that doesn't support `len()`? A function? – LarsVegas Aug 10 '12 at 12:09
  • 1
    @larsvegas `itertools.count(10)` which is a generator – jamylak Aug 10 '12 at 12:10
  • 2
    @jamylak: Note that `itertools.count(10)` is an infinite generator, so you don't want to enumerate it either. – Sven Marnach Aug 10 '12 at 12:23
  • 2
    @SvenMarnach ok then `itertools.islice(itertools.count(10, 2), 50, 100)` is probably a better example although you could just use maths to make an `xrange` from that. – jamylak Aug 10 '12 at 12:28
  • 2
    @jamylak: Yes, or `iter([])` for a more concise one. :) – Sven Marnach Aug 10 '12 at 12:59
  • The overhead on returning an object is about 10-15%. If you're going to access the data, use enumerate. If you only care about indices, range/xrange would be the faster solution. – ofer.sheffer Nov 20 '16 at 17:47
22

Using xrange with len is quite a common use case, so yes, you can use it if you only need to access values by index.

But if you prefer to use enumerate for some reason, you can use an underscore (_). It's just a frequently seen notation that shows you won't use the variable in any meaningful way:

for i, _ in enumerate(a):
    print i

There's also a pitfall with the underscore (_). It is common to name the 'translating' function _ in i18n libraries and systems, so be careful when using it together with gettext or a similar library (thanks to @lazyr).

Rostyslav Dzinko
  • 2
    Beware of using this idiom in combination with [`gettext`](http://docs.python.org/library/gettext.html#gettext.install) though, because it uses the `_` variable for something else, and this use would shadow the `gettext` `_` within the current namespace. It could lead to strange bugs. – Lauritz V. Thaulow Aug 10 '12 at 11:58
  • @lazyr That behaviour is only in Python 2.x I'm pretty sure – jamylak Aug 10 '12 at 12:03
  • 2
    @jamylak [Nope](http://docs.python.org/py3k/library/gettext.html#gettext.install) – Lauritz V. Thaulow Aug 10 '12 at 12:05
  • @lazyr Oh right I was remembering the list comprehension leaking was changed in 3.x my bad. – jamylak Aug 10 '12 at 12:08
  • 2
    The most important reason not to use `_` as a variable name is that people have all sorts of strange misconceptions about it and tend to mistake it for some kind of special syntax. I've seen *lots* of people being confused by this, so I'd simply avoid this confusion by calling it `dummy`. Explicit is better than implicit. – Sven Marnach Aug 10 '12 at 12:25
  • @SvenMarnach `_` actually looks nice to me though, probably why they use it in Haskell. I would rather simply do `for i, el` than `for i, dummy`, which is what I've been doing recently when I've had dummy values anyway. – jamylak Aug 10 '12 at 12:51
  • @jamylak: In Haskell and, say, Go the underscore has a special meaning. It's part of the language and people need to learn it anyway. In Python, using `_` isn't special, so the confusion it causes is unnecessary. (In addition, it clashes with the `gettext` alias and the underscore in the interactive interpreter. A single disadvantage without any advantage should be enough as an argument not to use it, though.) – Sven Marnach Aug 10 '12 at 13:05
  • @SvenMarnach Good point, although I still don't like the `dummy` name, it would look nicer to call it `el` and I think that would be fine even if it's not used. – jamylak Aug 10 '12 at 13:08
  • 1
    @jamylak: The latter is also what I do with unused names. The suggestion to call it `dummy` is only because people tend to argue that `_` makes it clear that the variable is a dummy variable. (Why this would be "clear" is their secret.) Moreover, some IDEs warn about unused variables, and there are usually some name pattern they ignore, like `unused_xxx` or similar. – Sven Marnach Aug 10 '12 at 13:14
  • 2
    @jamylak. My five cents about IDEs: Eclipse-PyDev/Aptana makes an exception for the underscore by default when it looks for unused names (it doesn't take it into account). – Rostyslav Dzinko Aug 10 '12 at 13:17
17

That's a rare requirement – the only information used from the container is its length! In this case, I'd indeed make this fact explicit and use the first version.

Sven Marnach
4

xrange should be a little faster, but with enumerate you won't need to change the loop when you realise that you need p after all.

John La Rooy
3

I ran a timing test and found that range is about 2x faster than enumerate (on Python 3.6 for Win32).

best of 3, for len(a) = 1M

  • enumerate(a): 0.125s
  • range(len(a)): 0.058s

Hope it helps.

FYI: I initially started this test to compare the speed of Python and VBA, and found that VBA is actually about 7x faster than the range method. Is that because of my poor Python skills?

Surely Python can do better than VBA somehow.

script for enumerate

import time
a = [0]
a = a * 1000000
start = time.perf_counter()  # record the start time

for i,j in enumerate(a):
    pass

print(time.perf_counter() - start)  # elapsed seconds

script for range

import time
a = [0]
a = a * 1000000
start = time.perf_counter()  # record the start time

for i in range(len(a)):
    pass

print(time.perf_counter() - start)  # elapsed seconds

script for vba (0.008s)

Sub timetest_for()
Dim a(1000000) As Byte
Dim i As Long
tproc = Timer
For i = 1 To UBound(a)
Next i
Debug.Print Timer - tproc
End Sub
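For what it's worth, the same comparison can be sketched with the standard `timeit` module, which repeats the measurement and so is less sensitive to one-off noise. This is only a sketch in Python 3.6+: the list size, the repeat counts and the function names `loop_range` and `loop_enumerate` are assumptions chosen for illustration, and the absolute numbers will vary by machine and Python version.

```python
import timeit

a = [0] * 100_000  # smaller than the 1M list above, to keep the run short


def loop_range():
    for i in range(len(a)):
        pass


def loop_enumerate():
    for i, _item in enumerate(a):
        pass


# repeat() returns one total per run; the minimum is the least noisy figure.
t_range = min(timeit.repeat(loop_range, number=10, repeat=3))
t_enum = min(timeit.repeat(loop_enumerate, number=10, repeat=3))

print(f"range(len(a)): {t_range:.4f}s")
print(f"enumerate(a):  {t_enum:.4f}s")
```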
RichieV
1

I wrote this because I wanted to test it. The answer depends on whether you need to work with the values.

Code:

testlist = []
for i in range(10000):
    testlist.append(i)

def rangelist():
    a = 0
    for i in range(len(testlist)):
        a += i
        a = testlist[i] + 1   # Comment this line for example for testing

def enumlist():
    b = 0
    for i, x in enumerate(testlist):
        b += i
        b = x + 1   # Comment this line for example for testing

import timeit
t = timeit.Timer(lambda: rangelist())
print("range(len()):")
print(t.timeit(number=10000))
t = timeit.Timer(lambda: enumlist())
print("enum():")
print(t.timeit(number=10000))

If you run it as is, you will most likely find that enumerate() is faster. If you comment out the lines a = testlist[i] + 1 and b = x + 1, you will see that range(len()) is faster.

For the code above I get:

range(len()):
18.766527627612255
enum():
15.353173553868345

Now when commenting as stated above I get:

range(len()):
8.231641875551514
enum():
9.974262515773656
user136036
  • I figure you should add a helper note that this test shows how enumerate is faster when you access the elements of the list and range(len) is faster when you don't. – ofer.sheffer Nov 20 '16 at 17:41
0

Based on your sample code,

res = [[profiel.attr[i].x for i,p in enumerate(profiel.attr)] for profiel in prof_obj]

I would replace it with

res = [[p.x for p in profiel.attr] for profiel in prof_obj]
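As a quick check that the two forms give the same result, here is a minimal sketch in Python 3. The objects built with `SimpleNamespace` are hypothetical stand-ins for `prof_obj` and its items, since the real classes are not shown in the question.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for the objects in the question.
prof_obj = [
    SimpleNamespace(attr=[SimpleNamespace(x=1), SimpleNamespace(x=2)]),
    SimpleNamespace(attr=[SimpleNamespace(x=3)]),
]

# Original form: enumerate() is used only to index back into the same list.
by_index = [[profiel.attr[i].x for i, p in enumerate(profiel.attr)] for profiel in prof_obj]

# Simplified form: iterate over the items directly.
direct = [[p.x for p in profiel.attr] for profiel in prof_obj]

print(by_index == direct, direct)
```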
Hugh Bothwell
-2

Just use range(). If you're going to use all the indexes anyway, xrange() provides no real benefit (unless len(a) is really large). And enumerate() creates a richer data structure that you're going to throw away immediately.

Rajesh J Advani
  • 6
    xrange() provides really great benefit! It doesn't create temporary list in memory, it's a generator – Rostyslav Dzinko Aug 10 '12 at 11:56
  • Not for this requirement. The OP is just creating a range of the index. – Rajesh J Advani Aug 10 '12 at 11:57
  • 3
    @RajeshJAdvani No he is iterating through and printing them one by one. – jamylak Aug 10 '12 at 11:58
  • Be that as it may, it's just a list of numbers. But yes, if it's a really large array, then `xrange` would be useful. Updated my answer to reflect that. – Rajesh J Advani Aug 10 '12 at 12:00
  • 2
    @RostyslavDzinko `xrange` is not a generator. It is a sequence object which lazily evaluates. – jamylak Aug 10 '12 at 12:01
  • Also, I'm assuming `print i` was just an example. That's not a real requirement. – Rajesh J Advani Aug 10 '12 at 12:02
  • @jamylak I looked at the CPython source, you're right. xrange() is an iterator with an internal index, but not a generator. Thanks a lot for your comment! – Rostyslav Dzinko Aug 10 '12 at 12:10
  • @RostyslavDzink Actually the iterator is `rangeiterator` which iterates over the `xrange` sequence. When you write `for i in xrange(3)` for example, it calls `iter(xrange(3))` to get the `rangeiterator` – jamylak Aug 10 '12 at 12:16
  • @jamylak Yeah, I've got that. When I said 'iterator' I was being imprecise, since generators are iterators too =). rangeiterator, as an *implementation* of the iterator interface, is the more precise answer for xrange. Thanks again. – Rostyslav Dzinko Aug 10 '12 at 12:20
  • @RostyslavDzinko It shouldn't matter as long as it's iterable in Python due to duck typing but they are distinct terms. `>>> import types, collections` `>>> isinstance(xrange(3), collections.Iterator)` `False` `>>> isinstance(iter(xrange(3)), collections.Iterator)` `True` `>>> isinstance(xrange(3), types.GeneratorType)` `False` `>>> isinstance(xrange(3), collections.Sequence)` `True` – jamylak Aug 10 '12 at 12:59
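The same checks can be sketched for Python 3, where `range` plays the role of Python 2's `xrange` and the abstract base classes live in `collections.abc`. The dictionary `checks` is just an illustrative way to collect the results.

```python
from collections.abc import Iterator, Sequence
import types

r = range(3)

checks = {
    "range is Iterator": isinstance(r, Iterator),
    "iter(range) is Iterator": isinstance(iter(r), Iterator),
    "range is generator": isinstance(r, types.GeneratorType),
    "range is Sequence": isinstance(r, Sequence),
}

for name, result in checks.items():
    print(name, result)
```

So a `range` object is a lazily evaluated sequence; the iterator is a separate object obtained from it with `iter()`.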