
What is wrong with the following?:

lss = reduce(list.extend, [[1],[2]], [])

causes:

Traceback (most recent call last):
  File "<pyshell#230>", line 1, in <module>
    lss = reduce(list.extend, [[1],[2]], [])
TypeError: descriptor 'extend' requires a 'list' object but received a 'NoneType'

I'm not sure where the NoneType is coming from.

cammil

4 Answers


Try this instead:

lss = reduce(lambda acc, ele: acc + ele, [[1],[2]], [])

>>> lss
[1, 2]

The problem is that extend() returns None (that's where the NoneType is coming from), and that won't work here: the function passed to reduce() must return a value, namely the accumulated result so far.
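
If you really want to stick with extend(), one workaround (just a sketch; the `or acc` trick trades readability for brevity) is to make the function hand the accumulator back explicitly:

# extend() mutates acc in place and returns None,
# so "or acc" makes the lambda return the accumulator to reduce().
lss = reduce(lambda acc, ele: acc.extend(ele) or acc, [[1],[2]], [])

>>> lss
[1, 2]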

Óscar López
  • I realise I can do that. I would still like to know what is wrong with using the list.extend method. It seems like it should work. - your edit answered this. – cammil Jul 04 '12 at 15:31
  • @cammil as I mentioned above: `extend()` will _always_ return `None`; for `reduce()` to work the function that gets executed _has to_ return a value: the accumulated result so far. – Óscar López Jul 04 '12 at 15:33
  • In case anyone is wondering *why* `list.extend()` returns `None`: it's because in Python, functions that mutate a data structure always return `None`; Python abides by Command/Query Separation. http://en.wikipedia.org/wiki/Command-query_separation – steveha Jul 04 '12 at 20:41

I think it's worth noting that:

sum([[1],[2]], [])

will also work, and I'm pretty sure it will be faster than passing a lambda to reduce.

I was curious as to the speed of different methods, so I did some testing:

reduce(lambda a,b:a+b, x, [])               3644.38161492
reduce(list.__add__, x, [])                 3609.44079709
sum(x,[])                                   3526.84987307
y = [];for z in x: y.extend(z)               143.370306969
y = [];map(y.extend,x)                        71.7020270824
y = [None]*400;del y[:];map(y.extend,x)       66.2245891094
list(itertools.chain(*x))                    102.285979986
list(itertools.chain.from_iterable(x))        96.6231369972
[a for b in x for a in b]                    203.764872074

And on PyPy (Because, why not)

reduce(lambda a,b:a+b, x, [])               4797.5895648
reduce(list.__add__, x, [])                 4794.01214004
sum(x,[])                                   4748.02929902
y = [];for z in x: y.extend(z)                56.9253079891
y = [];map(y.extend,x)                        73.8642170429
y = [None]*400;del y[:];map(y.extend,x)      152.157783031
list(itertools.chain(*x))                    633.854824066
list(itertools.chain.from_iterable(x))       629.917827129
[a for b in x for a in b]                     89.6922459602

In both runs, x was:

x = [[1,2,3,4],[2,3,4,5],[3,4,5,6],[4,5,6,7],[5,6,7,8],[6,7,8,9],[7,8,9,10],[8,9,10,11]]*100
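
The exact timing harness isn't shown here, so the following is only an assumed sketch of how comparable numbers could be collected with timeit (the function name and repeat count are mine, not from the benchmark above):

import timeit

# Assumed test data, same shape as the x above.
x = [[1,2,3,4],[2,3,4,5],[3,4,5,6],[4,5,6,7],
     [5,6,7,8],[6,7,8,9],[7,8,9,10],[8,9,10,11]] * 100

def flatten_extend(seq):
    # The fast CPython variant from the table: one list, extended in a loop.
    y = []
    for z in seq:
        y.extend(z)
    return y

print(timeit.timeit(lambda: flatten_extend(x), number=10000))  # loop + extend
print(timeit.timeit(lambda: sum(x, []), number=10000))         # repeated list adds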

Conclusions:

  1. Using a lambda in your reduce is slow.
  2. The specialized sum function is faster than reduce.
  3. Adding lists is slow.
  4. Python loop overhead is significant.
Winston Ewert
  • If you set x = x*100 you can start to see the real speed differences between the algorithms. – DSM Jul 04 '12 at 16:10
  • Unfortunately, the [two ways of joining a list of lists that are usually recommended](http://stackoverflow.com/questions/952914/making-a-flat-list-out-of-list-of-lists-in-python) are lacking from this benchmark: `list(itertools.chain.from_iterable(x))` and `[b for a in x for b in a]`. The first one is probably the fastest option. – Sven Marnach Jul 05 '12 at 14:33
  • @SvenMarnach, added them. However, my results have them both being slower than some of the existing options. – Winston Ewert Jul 05 '12 at 15:05
  • On two machines I tested, `list(itertools.chain.from_iterable(x))` is consistently faster than `list(itertools.chain(*x))`, and by quite a margin so for bigger lists that don't have the artificially high cache locality of your example list (caused by `* 100`, which reuses the same lists hundred times). It is indeed slower than `y = []; map(y.extend, x)`, though. Interesting post! – Sven Marnach Jul 06 '12 at 11:44
  • @SvenMarnach, I went ahead and reran all the tests, my results now fit your experiences. – Winston Ewert Jul 07 '12 at 14:41
  • @WinstonEwert: I like the new option you added – I certainly wouldn't have expected this result. I would only use it if I really need the extra bit of performance, but it is an interesting option to be aware of – the Python list analogue of `std::vector::reserve()`. – Sven Marnach Jul 07 '12 at 14:53

As noted by Óscar López, list.extend returns None and thus cannot be used with reduce. As an alternative to the suggested lambda, you could also use list.__add__ with reduce:

>>> reduce(list.__add__, [[1],[2]], [])
[1, 2]
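
One portability note that may help later readers: on Python 3, reduce is no longer a builtin, so the same snippet needs an import (functools.reduce also exists on Python 2.6+):

from functools import reduce  # builtin on Python 2, moved to functools on Python 3

lss = reduce(list.__add__, [[1], [2]], [])
print(lss)  # [1, 2]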
silvado

You may want to use itertools.chain. From the Python docs:

Make an iterator that returns elements from the first iterable until it is exhausted, then proceeds to the next iterable, until all of the iterables are exhausted. Used for treating consecutive sequences as a single sequence. Equivalent to:

def chain(*iterables):
    # chain('ABC', 'DEF') --> A B C D E F
    for it in iterables:
        for element in it:
            yield element
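
Applied to the example from the question (a quick sketch; chain returns an iterator, so wrap it in list() to get a list back):

from itertools import chain

lss = list(chain(*[[1], [2]]))               # unpack the sublists into chain
lss = list(chain.from_iterable([[1], [2]]))  # or, without unpacking
print(lss)  # [1, 2]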
Simon Bergot