Here is the code:

import itertools, time

x = {1: [1, 2], 2: [2, 3], 3: [3, 4]}
l = [1, 2, 3] * 10000000

# Approach 1: build a nested list, then flatten it afterwards
start = time.time()
y = [x[i] for i in l]
y = list(itertools.chain.from_iterable(y))
print(y[:6])
print(time.time() - start)

# Approach 2: build the flat list as we go, by calling .extend in a comprehension
start = time.time()
y = []
[y.extend(x[i]) for i in l]
print(y[:6])
print(time.time() - start)

The first approach builds a nested list and flattens it once the comprehension has finished. The second constructs the flat list directly, extending it as the comprehension runs.

It seems the first approach is a bit faster:

3.8479559421539307
4.469805955886841

I am wondering if there are better (faster) ways to do it.

qed

2 Answers

The timings are closer on my machine:

>>> x = {1: [1, 2], 2: [2, 3], 3: [3, 4]}
>>> l = [1, 2, 3] * 10000000
>>> import timeit
>>> import itertools

>>> def by_chaining():
...     y = [x[i] for i in l]
...     return list(itertools.chain.from_iterable(y))
...
>>> timeit.timeit(by_chaining, number=1)
5.491670315985385

>>> def by_extension():
...     y = []
...     [y.extend(x[i]) for i in l]
...     return y
...
>>> timeit.timeit(by_extension, number=1)
5.656423713273028

The thing is, though, the .extend approach is really meant to be used in an explicit loop, because it's a function with side effects. With the list comprehension, you're producing a separate list full of None values that is then thrown away, which adds quite a bit of overhead. See:

>>> def by_extension_loop():
...     y = []
...     for i in l: y.extend(x[i])
...     return y
...
>>> timeit.timeit(by_extension_loop, number=1)
4.62852763674681
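
You can make that throwaway list visible on a small input: every .extend call returns None, so the comprehension materialises a list of Nones as long as l before discarding it:

>>> y = []
>>> [y.extend(x[i]) for i in [1, 2, 3]]
[None, None, None]
>>> y
[1, 2, 2, 3, 3, 4]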

Let's try some other approaches:

>>> def by_nested_comprehension():
...     return [e for i in l for e in x[i]]
...
>>> timeit.timeit(by_nested_comprehension, number=1)
5.102275806385393

Don't do this one. It produces the correct answer, but will take forever and use excessive memory: each + on a list builds a brand-new list, so summing n small lists copies O(n^2) elements in total. (It's probably part of the reason why sum was explicitly special-cased to refuse to sum strings, although I kinda wish they'd redirected it to a more efficient technique instead.)

>>> def by_sum():
...     return sum((x[i] for i in l), [])
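
If you want to see that superlinear growth without waiting forever, time it on small slices of l. The absolute numbers are machine-dependent, but doubling the slice should roughly quadruple the runtime:

>>> timeit.timeit(lambda: sum((x[i] for i in l[:10000]), []), number=1)  # some time t
>>> timeit.timeit(lambda: sum((x[i] for i in l[:20000]), []), number=1)  # roughly 4*t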

How about giving from_iterable a generator to work with?

>>> def by_chaining_generator():
...     return list(itertools.chain.from_iterable(x[i] for i in l))
...
>>> timeit.timeit(by_chaining_generator, number=1)
5.420730297100135

How about using map instead of comprehensions?

>>> def by_chaining_map():
...     return list(itertools.chain.from_iterable(map(x.get, l)))
...
>>> timeit.timeit(by_chaining_map, number=1)
4.707590696974194

>>> def by_chaining_map_2():
...     return list(itertools.chain.from_iterable(map(x.__getitem__, l)))
...
>>> timeit.timeit(by_chaining_map_2, number=1)
4.576851915207953
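
One behavioural difference to keep in mind between the two: x.get returns None for a missing key, which chain.from_iterable would only trip over later with a TypeError, while x.__getitem__ fails fast with a KeyError:

>>> x.get(99) is None
True
>>> x[99]
Traceback (most recent call last):
  ...
KeyError: 99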

Looks like that one might be a winner, but it's close. It's also possible to optimize the list.extend version by pre-allocating the output list and assigning into slices, IIRC. But my creativity seems exhausted for the moment :)
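
For completeness, here is a minimal (untimed) sketch of that pre-allocation idea; it assumes every value list in x has the same fixed length, which holds for this data:

>>> def by_preallocation():
...     step = 2  # length of every value list in x (assumed uniform)
...     y = [None] * (len(l) * step)
...     for j, i in enumerate(l):
...         y[j * step:(j + 1) * step] = x[i]
...     return y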

Karl Knechtel

The answer is actually hidden in your question. A much faster way to achieve what you want is:

l = [1, 2, 3]
x = {1: [1, 2], 2: [2, 3], 3: [3, 4]}

# Flatten one period of the repeating pattern...
y = []
for k in l:
    y.extend(x[k])

# ...then replicate it 10,000,000 times in one step.
y *= 10000000
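
This exploits the structure of the input: l is just [1, 2, 3] repeated 10,000,000 times, so you only need to flatten one period and let list repetition (a fast block copy at the C level) do the rest. A quick check that it starts the same way as the other approaches:

>>> y[:6]
[1, 2, 2, 3, 3, 4]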
Nasser Al-Shawwa