
Based on this question: Python flatten an array of arrays.

I want a faster way than the double-loop solution, so I wrote a functools-based function, but it seems much slower.

orders2.shape
(9966, 1)

import time

t0 = time.time()
[x for i in orders2.values.tolist() for x in i[0].tolist()]
t1 = time.time()

t1-t0
0.009984493255615234
import functools
import time

t0 = time.time()
functools.reduce(lambda a,b: a+b[0].tolist() , orders2.values.tolist(), [])
t1 = time.time()

t1-t0
1.4101920127868652

My questions are:

  1. How could this happen?
  2. Will the functools approach be faster than the double loop on bigger data?
  3. Are there other algorithms faster than the double loop?

hpaulj
Tommy Yu

3 Answers


This is closely related to the answers given to list comprehension vs map. Because you're using a lambda with your reduce call, Python-level code has to run on every iteration, which slows the reduce down. List comprehensions are more efficient and more readable, so they are the preferred choice.

That said, why not use itertools.chain.from_iterable together with mapping operator.itemgetter? This produces the same output while leaning on some great built-ins. I haven't tested it for speed.

>>> from numpy import array
>>> from itertools import chain
>>> from operator import itemgetter
>>> arr = array([[array([33120, 28985,  9327, 45918, 30035, 17794, 40141,  1819, 43668],
      dtype='int64')],
       [array([33754, 24838, 17704, 21903, 17668, 46667, 17461, 32665],
      dtype='int64')],
       [array([46842, 26434, 39758, 27761, 10054, 21351, 22598, 34862, 40285,
       17616, 25146, 32645, 41276], dtype='int64')],
       [array([24534,  8230, 14267,  9352,  3543, 29397,   900, 32398, 34262,
       37646, 11930, 37173], dtype='int64')],
       [array([25157], dtype='int64')],
       [array([ 8859, 20850, 19322,  8075], dtype='int64')]], dtype=object)
>>> array(list(chain.from_iterable(map(itemgetter(0),arr.tolist()))))
[33120 28985  9327 45918 30035 17794 40141  1819 43668 33754 24838 17704
 21903 17668 46667 17461 32665 46842 26434 39758 27761 10054 21351 22598
 34862 40285 17616 25146 32645 41276 24534  8230 14267  9352  3543 29397
 900 32398 34262 37646 11930 37173 25157  8859 20850 19322  8075]
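
Since I haven't benchmarked it, here is a rough sketch of how the three approaches could be timed against each other. The data below is synthetic; the shapes, sizes and value ranges are assumptions, not the asker's actual orders2:

import timeit
from functools import reduce
from itertools import chain
from operator import itemgetter

import numpy as np

# Synthetic (N, 1) object array of int arrays, standing in for orders2.values.
rng = np.random.default_rng(0)
rows = [rng.integers(0, 50000, size=int(rng.integers(1, 14))) for _ in range(1000)]
arr = np.empty((len(rows), 1), dtype=object)
for i, r in enumerate(rows):
    arr[i, 0] = r

def nested(a):
    # the double-loop list comprehension from the question
    return [x for i in a.tolist() for x in i[0].tolist()]

def with_reduce(a):
    # the functools.reduce version from the question
    return reduce(lambda acc, b: acc + b[0].tolist(), a.tolist(), [])

def with_chain(a):
    # the chain.from_iterable version from this answer
    return list(chain.from_iterable(map(itemgetter(0), a.tolist())))

for f in (nested, with_reduce, with_chain):
    print(f.__name__, timeit.timeit(lambda: f(arr), number=10))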
Jab

In short, function-call and list re-allocation overheads aside, your nested-loop algorithm is O(N) while the one using reduce is O(N²).

Even if the algorithms were not different, the idea that calling a function has zero cost comes from mathematics, where functions are nice theoretical constructs.

When running in a computer program, calling a function requires a context to be initialised - in Python's case, the creation of a frame object with local variables. Since arguments are passed around, a tuple holding them has to be constructed before the function call and unpacked in the function body (although these steps may be optimised by the implementation).
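
That per-call cost is easy to observe; here is a hedged micro-benchmark (exact numbers will vary by machine and interpreter):

import timeit

data = list(range(100000))

# One Python-level call (and one frame) per element.
t_calls = timeit.timeit(lambda: list(map(lambda x: x, data)), number=100)

# Same iteration, no extra Python-level calls.
t_inline = timeit.timeit(lambda: [x for x in data], number=100)

print(f"per-element calls: {t_calls:.3f}s, inline: {t_inline:.3f}s")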

In the two-nested-loops approach, by contrast, all you have to do is iterate an iterator in native code. Although theoretically, per Python's specs, that also implies calling a function (the object's __iter__ method), it is usually much faster with native-code iterators in real implementations.

That, however, would not account for the difference you are seeing. The main problem is that on each iteration, when performing a + b[0].tolist(), a new list is created in memory, the values of a are copied into it, and then the values from b[0] are appended. This new list, and the copying of all elements already flattened, happens at every step. In the list-comprehension case there is no redundant copying: each element is placed as it comes up from unrolling the parent 2D structure, and Python is well optimised to pre-allocate space for lists that grow while being built this way.
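
The quadratic behaviour is easy to demonstrate. In this sketch with synthetic sub-lists (sizes are arbitrary), doubling the input should roughly quadruple the reduce time while only roughly doubling the comprehension time:

import timeit
from functools import reduce

for n in (1000, 2000, 4000):
    chunks = [[i] * 8 for i in range(n)]  # n small sub-lists to flatten
    t_reduce = timeit.timeit(lambda: reduce(lambda a, b: a + b, chunks, []), number=5)
    t_comp = timeit.timeit(lambda: [x for c in chunks for x in c], number=5)
    print(f"n={n}: reduce={t_reduce:.4f}s  comprehension={t_comp:.4f}s")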

jsbueno

I think there are at least two problems:

  1. With the first one, you are creating a list and appending elements to it. But with the second one, you keep concatenating two lists with a + b[0].tolist(), which creates a new list each time.

  2. functools.reduce does not return a generator; it eagerly folds the sequence and returns a single value (see the snippet below). It is a general-purpose fold, built for generality, not for speed.
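
A quick check of what reduce actually returns:

>>> from functools import reduce
>>> result = reduce(lambda a, b: a + b, [[1], [2], [3]], [])
>>> result
[1, 2, 3]
>>> type(result)
<class 'list'>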

Sraw