52

Suppose I have a function like this:

def getNeighbors(vertex)

which returns a list of vertices that are neighbors of the given vertex. Now I want to create a list with all the neighbors of the neighbors. I do that like this:

listOfNeighborsNeighbors = []
for neighborVertex in getNeighbors(vertex):
    listOfNeighborsNeighbors.append(getNeighbors(neighborVertex))

Is there a more pythonic way to do that?

Chris Martin
Björn Pollex
  • I think both the duplicate and this question choose the wrong answer, though. [See here for the more pythonic/performant answer.](https://stackoverflow.com/a/953097/365102) – Mateen Ulhaq Nov 30 '18 at 01:52

7 Answers

76

As usual, the itertools module contains a solution:

>>> l1 = [1, 2, 3]
>>> l2 = [4, 5, 6]
>>> l3 = [7, 8, 9]
>>> import itertools
>>> list(itertools.chain(l1, l2, l3))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
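Applied to the question, the same idea reads naturally with itertools.chain.from_iterable, which accepts an iterable of lists (a runnable sketch; the small graph dict here is made up for illustration):

import itertools

# Hypothetical adjacency map standing in for the OP's graph.
graph = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}

def getNeighbors(vertex):
    return graph[vertex]

# Chain the neighbor lists lazily, then materialize once.
listOfNeighborsNeighbors = list(
    itertools.chain.from_iterable(getNeighbors(n) for n in getNeighbors("a"))
)
print(listOfNeighborsNeighbors)  # ['a', 'd', 'a']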
Jochen
53
[x for n in getNeighbors(vertex) for x in getNeighbors(n)]

or

sum((getNeighbors(n) for n in getNeighbors(vertex)), [])
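For a concrete run of both one-liners (a sketch; the tiny graph below is invented to stand in for the OP's getNeighbors):

graph = {1: [2, 3], 2: [1, 4], 3: [1], 4: [2]}

def getNeighbors(vertex):
    return graph[vertex]

vertex = 1
# Nested comprehension: one flat list of neighbors-of-neighbors.
flat = [x for n in getNeighbors(vertex) for x in getNeighbors(n)]
print(flat)  # [1, 4, 1]

# sum() gives the same result, but see the comments: it is quadratic.
assert sum((getNeighbors(n) for n in getNeighbors(vertex)), []) == flat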
Ignacio Vazquez-Abrams
  • 776,304
  • 153
  • 1,341
  • 1,358
  • +1 I was going to suggest a list comprehension. IMHO, it's the most pythonic way. – Evan Plaice Jun 11 '10 at 10:36
  • However, see the timing comparisons, as comments under emu's answer: both "itertools.chain" and "reduce(iadd" are more than twice as fast as the nested list comprehension -- and MUCH faster than sum(), which degrades rapidly with # elements processed. – ToolmakerSteve Dec 20 '13 at 01:07
  • So glad I found this. Tried many times, never with such a 2nd argument `[]` to the sum of lists. – Guillaume Chevalier Aug 23 '18 at 03:24
  • Second solution looks very cool. And works in practice. And it cost me hours of profiling and debugging because it just doesn't work for large N! Please put a note that the second solution has quadratic time complexity! – Maxim Imakaev May 18 '21 at 22:45
43

Concatenating lists can be done with + and sum():

>>> c = [[1, 2], [3, 4]]
>>> sum(c, [])
[1, 2, 3, 4]
Sjoerd
  • Thanks - I *knew* there had to be some way to do this with sum! BTW, it wasn't clear to me that this would work with more than 2 sub-lists, or variable length lists; so clearer example might be: `c = [[1, 2], [3, 4, 5], [6, 7]]` => `[1, 2, 3, 4, 5, 6, 7]` – ToolmakerSteve Dec 19 '13 at 22:37
  • BUT see the timings I did as comments under emu's answer. **DO NOT USE SUM -- VERY SLOW** FOR 100 lists of 100 items! – ToolmakerSteve Dec 20 '13 at 00:59
  • Why is the second argument to sum required? I would think sum([[1, 2], [3, 4]]) was clear as day to mean [1, 2] + [3, 4]. – KeithWM Oct 10 '18 at 20:06
  • @KeithWM Because `sum([[1, 2], [3, 4]])` does not mean `[1, 2] + [3, 4]`, but rather `0 + [1, 2] + [3, 4]`, which doesn't work. You need the optional second argument to replace that starting `0` with a `[]`, so that `sum([[1, 2], [3, 4]], [])` is `[] + [1, 2] + [3, 4]`. – Stef Jul 08 '21 at 12:01
  • @Stef Thank you very much! That explains a lot of errors I've experienced in the past while using sum. – KeithWM Jul 08 '21 at 12:29
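To make the start-value point above concrete (a minimal illustration):

c = [[1, 2], [3, 4, 5], [6, 7]]

# Without a start value, sum() begins at 0 and tries 0 + [1, 2]:
# sum(c)  # TypeError: unsupported operand type(s) for +: 'int' and 'list'

# With [] as the start value it computes [] + [1, 2] + [3, 4, 5] + [6, 7]:
print(sum(c, []))  # [1, 2, 3, 4, 5, 6, 7]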
15

Quickest to slowest:

list_of_lists = [[x, 1] for x in range(1000)]

%timeit list(itertools.chain.from_iterable(list_of_lists))
30 µs ± 320 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit list(itertools.chain(*list_of_lists))
33.4 µs ± 761 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

min(timeit.repeat("ll=[];\nfor l in list_of_lists:\n ll.extend(l)", "list_of_lists=[[x,1] for x in range(1000)]",repeat=3, number=100))/100.0
4.1411130223423245e-05

%timeit [y for z in list_of_lists for y in z]
53.9 µs ± 156 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit sum(list_of_lists, [])
1.5 ms ± 10.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

(Python 3.7.10)

Python2:

list_of_lists = [[x,1] for x in xrange(1000)]

%timeit list(itertools.chain(*list_of_lists))
100000 loops, best of 3: 14.6 µs per loop

%timeit list(itertools.chain.from_iterable(list_of_lists))
10000 loops, best of 3: 60.2 µs per loop

min(timeit.repeat("ll=[];\nfor l in list_of_lists:\n ll.extend(l)", "list_of_lists=[[x,1] for x in xrange(1000)]",repeat=3, number=100))/100.0
9.620904922485351e-05

%timeit [y for z in list_of_lists for y in z]
10000 loops, best of 3: 108 µs per loop

%timeit sum(list_of_lists, [])
100 loops, best of 3: 3.7 ms per loop
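For reproducibility outside IPython, the same comparison can be scripted with the timeit module directly (a sketch; absolute numbers vary by machine and Python version):

import timeit

setup = "import itertools; list_of_lists = [[x, 1] for x in range(1000)]"

candidates = {
    "chain.from_iterable": "list(itertools.chain.from_iterable(list_of_lists))",
    "chain(*...)": "list(itertools.chain(*list_of_lists))",
    "loop + extend": "ll = []\nfor l in list_of_lists: ll.extend(l)",
    "nested comprehension": "[y for z in list_of_lists for y in z]",
    "sum(..., [])": "sum(list_of_lists, [])",
}

for name, stmt in candidates.items():
    # Best of 5 repeats, each timing 1000 runs; report per-run time.
    best = min(timeit.repeat(stmt, setup, repeat=5, number=1000)) / 1000
    print(f"{name:22s} {best * 1e6:8.1f} µs")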
Yariv
  • `itertools.chain(list_of_lists)` is wrong (it won't concatenate anything because it's only given one parameter). You need a `*` there, or `chain.from_iterable`. – interjay May 11 '17 at 14:30
  • These timing results might be obsolete. Testing on 2018 HW with python3.6.6, I don't see any reproducible speed difference between the itertools.chain, itertools.chain.from_iterable, and functools.reduce/iadd solutions. YMMV. The iadd solution changes the inputs, though. – Amnon Harel Aug 09 '18 at 08:40
  • nobody said anything about the list of lists being dynamic so the from_iterable solution is off-topic imo – Benjamin Atkin Oct 24 '22 at 01:04
13

If speed matters, it may be better to use this:

from functools import reduce  # reduce is a builtin in Python 2; in Python 3 it lives in functools
from operator import iadd

reduce(iadd, (getNeighbors(n) for n in getNeighbors(vertex)))

The point of this code is that it concatenates whole lists with list.extend, whereas a list comprehension would add items one by one, as if calling list.append. That saves a bit of overhead, making the former (according to my measurements) about three times faster. (The iadd operator is normally written as += and does the same thing as list.extend.)

Using list comprehensions (the first solution by Ignacio) is still usually the right way: it is easier to read.

But definitely avoid using sum(..., []), because it runs in quadratic time: each addition builds a brand-new list, copying everything accumulated so far. That makes it impractical for more than a hundred or so lists.
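A sketch that also guards against the in-place mutation caveat raised in the comments below: passing a fresh [] as reduce's initial value makes iadd extend the accumulator rather than the first input list.

from functools import reduce  # builtin in Python 2
from operator import iadd

lists = [[1, 2], [3, 4], [5, 6]]

# [] is the accumulator, so the input lists are never modified in place.
flat = reduce(iadd, lists, [])
print(flat)   # [1, 2, 3, 4, 5, 6]
print(lists)  # [[1, 2], [3, 4], [5, 6]] -- unchanged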

emu
  • Thanks for the comment re sum's performance -- I liked how compact that code is, so good to know not to use it on large scale. IMHO, Jochen's itertools'chain solution from '10 is a more appropriate solution than reduce: it more directly/simply does what is being asked for. – ToolmakerSteve Dec 19 '13 at 22:40
  • I take back what I said about chain being more appropriate. While I do find it easier to read/understand, it has a slight performance issue, if the ultimate goal is to make a list: it passes back elements one at a time, so the list cannot predict how long it will be. In that sense, it is more similar to list comprehension/append than to reduce/extend. With timeit, for 100 lists x 100 elements each, reduce/iadd vs chain roughly equal time (1.1). With 4x the data (200 lists x 200 elements each), reduce wins (2.6 vs 4.0). – ToolmakerSteve Dec 20 '13 at 00:17
  • WARNING: iadd MODIFIES the first list passed in. Doesn't matter in the example, because the lists are results from a function. But I did one test where I passed in a list of lists that I had pre-computed. Altered my original list, which was not good to do. FIX: instead of `reduce(iadd, LL)` or even `reduce(iadd, (L for L in LL))`, must wrap each returned L in list(): `reduce(iadd, (list(L) for L in LL))`. This forces each L to be copied. (Which is quick, because size is known.) – ToolmakerSteve Dec 20 '13 at 00:25
  • .. List comprehension degrades more quickly (2.4 => 9.1). Sum is WAY worse (13.8 => 130.2)! Repeating those numbers together for easier comparison: (reduce, chain, comprehension, sum) @ 100x100 = (1.1, 1.1, 2.6, 13.8); @ 200x200 = (2.6, 4.0, 9.1, 130.2). – ToolmakerSteve Dec 20 '13 at 00:45
  • Test code (python 2.7): `print timeit('all = reduce(operator.iadd, (list(list_) for list_ in LL))', number=1000, setup='n = 100; import operator; L1 = list(range(n)); LL = [[10 * x + v for v in L1] for x in range(n)]')` `print timeit('all = list(itertools.chain(*LL))', number=1000, setup='n = 100; L1 = list(range(n)); LL = [[10 * x + v for v in L1] for x in range(n)]')` `print timeit('all = [x for list_ in LL for x in list_]', number=...` `print timeit('all = sum(LL, [])', number=...` THEN repeat those 4, with `n = 200;` instead of `100`. (Then I multiplied the resulting times by 10) – ToolmakerSteve Dec 20 '13 at 00:56
  • @emu do you know why sum runs in quadratic time? That surprises me. – drevicko Jun 25 '14 at 17:16
  • @ToolmakerSteve you can add a second argument to `reduce` as an initial value, circumventing the need to copy the lists. – drevicko Jun 25 '14 at 18:12
  • @drevicko Because it has no choice but to construct a new list during each addition, and that is a linear-time operation. – emu Jun 25 '14 at 20:50
3

I like the itertools.chain approach because it runs in linear time (sum(...) runs in quadratic time), but @Jochen didn't show how to deal with lists of dynamic length. Here is a solution for the OP's question.

import itertools
list(itertools.chain(*[getNeighbors(n) for n in getNeighbors(vertex)]))

You can get rid of the list(...) call if an iterable is sufficient for you.

renadeen
  • You can also get rid of the unpacking `*[getNeighbors...]` by using `chain.from_iterable` like this: `list(itertools.chain.from_iterable(getNeighbors(n) for n in getNeighbors(vertex)))` – emu Jun 30 '16 at 10:34
  • Or you can keep the unpacking but not have it generate a list by doing `list(itertools.chain(*(getNeighbors(n) for n in getNeighbors(vertex))))` – Benjamin Atkin Oct 24 '22 at 01:08
0

Using .extend() (update in place) combined with reduce, instead of sum() (new object each time), should be more efficient; however, I'm too lazy to test that :)

from functools import reduce  # needed on Python 3

mylist = [[1, 2], [3, 4], [5, 6]]
reduce(lambda acc_l, sl: acc_l.extend(sl) or acc_l, mylist)
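A quick way to actually run that test (a sketch; absolute timings vary by machine). Note the fresh [] passed as reduce's initial value, so repeated timing runs don't keep extending the first input list:

import timeit

setup = "from functools import reduce; lists = [[x, 1] for x in range(1000)]"

t_reduce = timeit.timeit(
    "reduce(lambda acc, sl: acc.extend(sl) or acc, lists, [])", setup, number=1000
)
t_sum = timeit.timeit("sum(lists, [])", setup, number=1000)
print(f"reduce/extend: {t_reduce:.3f}s   sum(..., []): {t_sum:.3f}s")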
realmaniek
  • It is indeed faster, but as [Yariv's answer](https://stackoverflow.com/a/33277438/160206) shows, it's not the fastest approach. – Björn Pollex Nov 06 '17 at 10:07