Creating list of individual list items multiplied n times

Question

I'm fairly new to Python, and think this should be a fairly common problem, but can't find a solution. I've already looked at this page and found it helpful for one item, but I'm struggling to extend the example to multiple items without using a 'for' loop. I'm running this bit of code for 250 walkers through Emcee, so I'm looking for the fastest way possible.

I have a list of numbers, a = [x,y,z] that I want to repeat b = [1,2,3] times (for example), so I end up with a list of lists:

[
 [x],
 [y,y],
 [z,z,z]
]

The 'for' loop I have is:

c = [ ]
for i in range (0,len(a)):
    c.append([a[i]]*b[i])

Which does exactly what I want, but means my code is excruciatingly slow. I've also tried naively turning a and b into arrays and doing [a]*b in the hopes that it would multiply element by element, but no joy.

Ashwini Chaudhary · Accepted Answer · 2013-06-02T09:26:40.297

10

You can use zip and a list comprehension here:

>>> a = ['x','y','z']
>>> b = [1,2,3]
>>> [[x]*y for x,y in zip(a,b)]
[['x'], ['y', 'y'], ['z', 'z', 'z']]

or:

>>> [[x for _ in xrange(y)] for x,y in zip(a,b)]
[['x'], ['y', 'y'], ['z', 'z', 'z']]

zip will create the whole list in memory first, to get an iterator use itertools.izip

In case a contains mutable objects like lists or lists of lists, then you may have to use copy.deepcopy here because modifying one copy will change other copies as well.:

>>> from copy import deepcopy as dc
>>> a = [[1 ,4],[2, 5],[3, 6, 9]]
>>> f = [[dc(x) for _ in xrange(y)] for x,y in zip(a,b)]

#now all objects are unique
>>> [[id(z) for z in x] for x in f]
[[172880236], [172880268, 172880364], [172880332, 172880492, 172880428]]

timeit comparisons(ignoring imports):

>>> a = ['x','y','z']*10**4
>>> b = [100,200,300]*10**4

>>> %timeit [[x]*y for x,y in zip(a,b)]
1 loops, best of 3: 104 ms per loop

>>> %timeit [[x]*y for x,y in izip(a,b)]
1 loops, best of 3: 98.8 ms per loop

>>> %timeit map(lambda v: [v[0]]*v[1], zip(a,b))
1 loops, best of 3: 114 ms per loop

>>> %timeit map(list, map(repeat, a, b))
1 loops, best of 3: 192 ms per loop

>>> %timeit map(list, imap(repeat, a, b))
1 loops, best of 3: 211 ms per loop

>>> %timeit map(mul, [[x] for x in a], b)
1 loops, best of 3: 107 ms per loop

>>> %timeit [[x for _ in xrange(y)] for x,y in zip(a,b)]
1 loops, best of 3: 645 ms per loop

>>> %timeit [[x for _ in xrange(y)] for x,y in izip(a,b)]
1 loops, best of 3: 680 ms per loop

edited Jun 02 '13 at 09:26

answered Jun 02 '13 at 08:33

Ashwini Chaudhary

244,495
58
464
504

1

I'd suggest mentioning `itertools.izip`. – kirelagin Jun 02 '13 at 08:37
1

Thanks! Will the 'for' loops still slow the code down, though? – user2444731 Jun 02 '13 at 08:37
@user2444731 what do you mean “slow”? Why do you think that `for`-loop is slow? Slow compared to what? – kirelagin Jun 02 '13 at 08:37
3

Since user is new, it's probably wise to point out that [x]*y creates y references to x, which can be problematic in some cases. As if one is changed, it is possible for all of them to be changed. – Nuclearman Jun 02 '13 at 08:38
As Nuclearman says, I'm new - I was told that `for`-loops are intrinsically slow, and that they're best avoided for MCMC - still blindly following any advice I hear :s – user2444731 Jun 02 '13 at 08:43
1

You probably should have mentioned that this was for a Monte Carlo algorithm when you posed the question. Context is helpful. – Nuclearman Jun 02 '13 at 08:48
@user2444731 List comprehensions are slightly faster than normal for-loops, I don't think you can get any faster than this in pure python. – Ashwini Chaudhary Jun 02 '13 at 08:56
@Nuclearman Yes, but note that `[[x for _ in range(y)] for x,y in zip(a,b)]` is perfectly equivalent to the `[x] * y` variant(apart from being much slower, since `[x] * y` is an operation completely done at C level, while the list-comprehension is a python loop however optimized). @user2444731 Do not fall in premature optimization! Always write the code in the simplest and more readable way, then, *after profiling*, find the bottlenecks and optimize the code. – Bakuriu Jun 02 '13 at 09:41
@Bakuriu: True, I brought it up [x] * y since the issue can't be avoided in that case. – Nuclearman Jun 02 '13 at 09:49

score 4 · Answer 2 · answered Jun 02 '13 at 09:00

4

The fastest way to do it is with map() and operator.mul():

>>> from operator import mul
>>> map(mul, [['x'], ['y'], ['z']], [1, 2, 3])
[['x'], ['y', 'y'], ['z', 'z', 'z']]

answered Jun 02 '13 at 09:00

Raymond Hettinger

216,523
63
388
485

3

+1 I just figured this out when you posted it (the input is not a list of lists however) – jamylak Jun 02 '13 at 09:00

score 3 · Answer 3 · answered Jun 02 '13 at 11:52

3

>>> from itertools import repeat
>>> from itertools import starmap
>>> a = ['x','y','z']
>>> b = [1,2,3]
>>> starmap(repeat,zip(a,b))

starmap returns an iterable which contains values equal to the result of calling repeat with arguments equal to the values contained in a tuple, in this case for example ('x',1).

>>> for p in starmap(repeat,zip(a,b)):
    print(list(p))


['x']
['y', 'y']
['z', 'z', 'z']

answered Jun 02 '13 at 11:52

HennyH

7,794
2
29
39

note that you have to call `map(list, ...)` on the `starmap` to get the list of lists – jamylak Jun 02 '13 at 12:04

jamylak · Answer 4 · 2013-06-02T08:57:10.117

@kirelagin suggested a version without for loops, here's one that also doesn't have lambdas (Keep in mind the solution by @AshwiniChaudhary is most readable)

>>> from itertools import repeat
>>> a = ['x','y','z']
>>> b = [1,2,3]
>>> map(list, map(repeat, a, b))
[['x'], ['y', 'y'], ['z', 'z', 'z']]

>>> map(repeat, a, b)
[repeat('x', 1), repeat('y', 2), repeat('z', 3)]

creates a list of repeat objects (use imap on Python 2.x if you want a lazy iterator instead of a list) which don't take up any extra space in memory, these are great if you just want to iterate over the items instead of store them)

kirelagin · Answer 5 · 2013-06-02T08:48:14.997

Here is a version without for loops if you don't like them for some reason:

map(lambda v: [v[0]]*v[1], zip(a,b))

I should also warn you that this version is slightly slower than a list comprehension:

$ a = ['hi']*100
$ b = [20]*100

$ %timeit map(lambda v: [v[0]]*v[1], zip(a,b))
10000 loops, best of 3: 101 us per loop

%timeit [[x]*y for x,y in zip(a,b)]
10000 loops, best of 3: 74.1 us per loop

I'd also recommend using itertools.izip instead of zip if you are on Python 2.

Creating list of individual list items multiplied n times

5 Answers5