Is it pythonic to use sum() for list concatenation?
>>> sum(([n]*n for n in range(1,5)),[])
[1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
No, it's not. It's actually a Shlemiel the painter algorithm: sum() builds a brand-new list for every concatenation, so each step has to copy everything accumulated so far, and the total work grows quadratically with the number of lists. (For more info, read this article by Joel: http://www.joelonsoftware.com/articles/fog0000000319.html)
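To make the repeated copying concrete, here is a minimal sketch of what sum(lists, []) effectively does. The helper name sum_like_concat and the copy counter are hypothetical, added only for illustration; this is not how sum() is actually implemented internally, just an equivalent loop under that assumption.

```python
def sum_like_concat(lists):
    """Mimic sum(lists, []) and count how many element copies it needs.

    Hypothetical helper for illustration only.
    """
    result = []
    copied = 0
    for sub in lists:
        # result + sub allocates a new list and copies both operands into it
        copied += len(result) + len(sub)
        result = result + sub
    return result, copied


flat, copied = sum_like_concat([n] * n for n in range(1, 5))
print(flat)    # [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
print(copied)  # 20 element copies to build a 10-element list
```

The gap between the copy count and the length of the result widens quickly as more lists are added.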
The most pythonic way is using a list comprehension:
In [28]: [t for n in range(1,5) for t in [n]*n ]
Out[28]: [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
Or itertools.chain:
In [29]: from itertools import chain
In [32]: list(chain.from_iterable([n]*n for n in range(1,5)))
Out[32]: [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
Or, as a purely generator-based approach, you can use repeat instead of multiplying the list:
In [33]: from itertools import chain, repeat
# In python2.X use xrange instead of range
In [35]: list(chain.from_iterable(repeat(n, n) for n in range(1,5)))
Out[35]: [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
Or, if you are interested in NumPy or want a very fast approach, here is one:
In [46]: import numpy as np
In [46]: np.repeat(np.arange(1, 5), np.arange(1, 5))
Out[46]: array([1, 2, 2, 3, 3, 3, 4, 4, 4, 4])
No, this will get very slow for large lists. List comprehensions are a far better option.
Code for timing list flattening via list comprehensions, summation and itertools.chain.from_iterable:
import time
from itertools import chain

def increasing_count_lists(upper):
    yield from ([n]*n for n in range(1,upper))

def printtime(func):
    def clocked_func(*args):
        t0 = time.perf_counter()
        result = func(*args)
        elapsed_s = time.perf_counter() - t0
        print('{:.4}ms'.format(elapsed_s*1000))
        return result
    return clocked_func

@printtime
def concat_list_sum(upper):
    return sum(increasing_count_lists(upper), [])

@printtime
def concat_list_listcomp(upper):
    return [item for sublist in increasing_count_lists(upper) for item in sublist]

@printtime
def concat_list_chain(upper):
    return list(chain.from_iterable(increasing_count_lists(upper)))
And running them:
>>> _ = concat_list_sum(5)
0.03351ms
>>> _ = concat_list_listcomp(5)
0.03034ms
>>> _ = concat_list_chain(5)
0.02717ms
>>> _ = concat_list_sum(50)
0.2373ms
>>> _ = concat_list_listcomp(50)
0.2169ms
>>> _ = concat_list_chain(50)
0.1467ms
>>> _ = concat_list_sum(500)
167.6ms
>>> _ = concat_list_listcomp(500)
8.319ms
>>> _ = concat_list_chain(500)
12.02ms
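Single perf_counter readings like the ones above can be noisy, so here is a rough sketch of the same comparison using the standard timeit module and taking the best of several runs. The names make_lists, flatten_sum, flatten_listcomp, flatten_chain and best_ms are hypothetical, and unlike the code above this version builds the sublists up front, so it times only the flattening step. Actual numbers will vary by machine and Python version.

```python
import timeit
from itertools import chain


def make_lists(upper):
    # Same shape of input as above: [1], [2, 2], [3, 3, 3], ...
    return [[n] * n for n in range(1, upper)]


def flatten_sum(lists):
    return sum(lists, [])


def flatten_listcomp(lists):
    return [item for sub in lists for item in sub]


def flatten_chain(lists):
    return list(chain.from_iterable(lists))


def best_ms(func, lists, repeat=3, number=3):
    # Run func(lists) `number` times per trial and keep the fastest trial,
    # reported as milliseconds per call
    timings = timeit.repeat(lambda: func(lists), repeat=repeat, number=number)
    return min(timings) / number * 1000


for upper in (5, 50, 500):
    lists = make_lists(upper)
    for func in (flatten_sum, flatten_listcomp, flatten_chain):
        print('{:<18}{:>5} {:.4}ms'.format(func.__name__, upper, best_ms(func, lists)))
```

Taking the minimum over several trials filters out most scheduling noise, which matters for the small inputs where the three approaches are close.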