I was reading through some older code of mine and came across this line:

itertools.starmap(lambda x,y: x + (y,), 
                  itertools.izip(itertools.repeat(some_tuple, 
                                                  len(list_of_tuples)),
                                 itertools.imap(lambda x: x[0],
                                                list_of_tuples)))

To be clear: I have a list_of_tuples, and I want the first item of each tuple (the itertools.imap). I have another tuple, some_tuple, that I want to repeat (itertools.repeat) so that there is one copy for each tuple in list_of_tuples. Finally, I want to build new, longer tuples by appending each first item to a copy of some_tuple (itertools.starmap).

For example, suppose some_tuple = (1, 2, 3) and list_of_tuples = [(1, other_info), (5, other), (8, 12)]. I want something like [(1, 2, 3, 1), (1, 2, 3, 5), (1, 2, 3, 8)]. This isn't the exact input and output (the real code uses some complex classes that aren't relevant here), and my actual lists and tuples are very big.

Is there a point to nesting the iterators like this? It seems to me like each function from itertools would have to iterate over the iterator I gave it and store the information from it somewhere, meaning that there is no benefit to putting the other iterators inside of starmap. Am I just completely wrong? How does this work?

Dan Oberlam
  • If you show us the expected input and output, it will be easier to work out what it's doing – Hackaholic Nov 17 '14 at 04:57
  • No, for the simple reason that it makes the code too hard to follow. Keep It Simple – John Mee Nov 17 '14 at 05:05
  • @gnibbler be that as it may, even removing it still gets us nested iterators, so unless removing that changes the answer I'm still wondering if there is a point to nesting in general – Dan Oberlam Nov 17 '14 at 05:21
  • @Dannnno, If you are really careful it's possible to get performance gains using itertools. It's unlikely to be the case here though. – John La Rooy Nov 17 '14 at 05:24

2 Answers

There is no reason to nest iterators. Assigning each intermediate iterator to a variable won't have a noticeable impact on performance or memory:

first_items = itertools.imap(lambda x: x[0], list_of_tuples)
repeated_tuple = itertools.repeat(some_tuple, len(list_of_tuples))
items = itertools.izip(repeated_tuple, first_items)
result = itertools.starmap(lambda x,y: x + (y,), items)

The iterator objects used and returned by itertools do not store all the items in memory, but simply calculate the next item when it is needed. You can read more about how they work here.
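
As a minimal sketch of that laziness, the snippet below rebuilds the pipeline with named variables and records when each first item is actually produced. It is written for Python 3, where the built-in map and zip are lazy and replace itertools.imap and itertools.izip; the helper noisy_first_items, the produced list, and the sample data are hypothetical names added for illustration.

```python
import itertools

produced = []  # hypothetical log of the first items pulled from the source so far


def noisy_first_items(pairs):
    """Yield the first item of each tuple, logging each one as it is produced."""
    for pair in pairs:
        produced.append(pair[0])
        yield pair[0]


some_tuple = (1, 2, 3)
list_of_tuples = [(1, "a"), (5, "b"), (8, 12)]

# Same pipeline as above, but with one variable per stage.
first_items = noisy_first_items(list_of_tuples)
repeated_tuple = itertools.repeat(some_tuple, len(list_of_tuples))
items = zip(repeated_tuple, first_items)
result = itertools.starmap(lambda x, y: x + (y,), items)

# Building the pipeline has not pulled anything from the source yet.
produced_after_setup = list(produced)

# Asking for one result pulls exactly one item through every stage.
first = next(result)
produced_after_first = list(produced)

# Consuming the rest pulls the remaining items, one at a time.
rest = list(result)
```

Here produced_after_setup is empty and produced_after_first holds a single item, so naming the intermediate iterators does not cause any of them to be materialised early.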

grc
  • I think that was really what my question was asking. I wasn't sure if by not nesting the iterators I would end up storing more things in memory than I wanted to. Thanks! – Dan Oberlam Nov 17 '14 at 05:39

I don't believe the combobulation above is necessary in this case.

It appears to be equivalent to this generator expression:

(some_tuple + (y[0],) for y in list_of_tuples)
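
A quick way to check that equivalence is to run both forms on the example data from the question. This sketch assumes Python 3 (zip and map stand in for itertools.izip and itertools.imap), and the strings in list_of_tuples are placeholders for the question's other_info values.

```python
import itertools

some_tuple = (1, 2, 3)
list_of_tuples = [(1, "other_info"), (5, "other"), (8, 12)]

# The nested itertools version from the question, ported to Python 3.
nested = itertools.starmap(
    lambda x, y: x + (y,),
    zip(itertools.repeat(some_tuple, len(list_of_tuples)),
        map(lambda x: x[0], list_of_tuples)),
)

# The generator expression proposed above.
simple = (some_tuple + (y[0],) for y in list_of_tuples)

nested_result = list(nested)
simple_result = list(simple)
print(nested_result)                   # [(1, 2, 3, 1), (1, 2, 3, 5), (1, 2, 3, 8)]
print(nested_result == simple_result)  # True
```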

However, itertools can occasionally have a performance advantage, especially in CPython.
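
Whether that advantage shows up has to be measured for each case. The following is only a rough sketch of such a measurement with the standard timeit module, assuming Python 3; the function names, the data size, and the repeat count are arbitrary choices for illustration, and the numbers will vary by machine and interpreter.

```python
import itertools
import timeit

some_tuple = (1, 2, 3)
# Hypothetical stand-in for the "very big" real data.
list_of_tuples = [(i, "payload") for i in range(10_000)]


def nested_version():
    """The nested itertools pipeline, ported to Python 3."""
    return list(itertools.starmap(
        lambda x, y: x + (y,),
        zip(itertools.repeat(some_tuple, len(list_of_tuples)),
            map(lambda x: x[0], list_of_tuples)),
    ))


def genexp_version():
    """The generator-expression equivalent."""
    return list(some_tuple + (y[0],) for y in list_of_tuples)


# Time 20 full runs of each version.
nested_seconds = timeit.timeit(nested_version, number=20)
genexp_seconds = timeit.timeit(genexp_version, number=20)
print(f"nested itertools:     {nested_seconds:.4f}s")
print(f"generator expression: {genexp_seconds:.4f}s")
```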

John La Rooy
  • So ignoring the example I gave, which I can see is not a good one, in general will the nesting of iterators have any kind of effect, positive or otherwise, on the performance of the program? Or is this something that would have to be decided case by case? – Dan Oberlam Nov 17 '14 at 05:29
  • @Dannnno, you should prefer to write the code in the most readable way you can. If it turns out to be a performance bottleneck, rewrite it in whatever way you think is faster. Keep the easy-to-read version as documentation. You can write your unit tests against both versions to detect whether the behaviour has accidentally changed in your faster version – John La Rooy Nov 17 '14 at 05:33