I have this list:
a=[7086, 4914, 1321, 1887, 7060]
. Now, I want to create duplicate of each value n-times. Such as:
n=2
a=[7086,7086,4914,4914,1321,1321,7060,7060]
How would I do this best? I tried a loop but it was inefficient.
I have this list:
a=[7086, 4914, 1321, 1887, 7060]
. Now, I want to create duplicate of each value n-times. Such as:
n=2
a=[7086,7086,4914,4914,1321,1321,7060,7060]
How would I do this best? I tried a loop but it was inefficient.
The idiomatic way in Python is to use zip and then flatten the resulting list of tuples.
As stated in the docs:
The left-to-right evaluation order of the iterables is guaranteed. This makes possible an idiom for clustering a data series into n-length groups using zip(*[iter(s)]*n).
So in this case:
>>> a=[7086, 4914, 1321, 1887, 7060]
>>> n=2
>>> [e for t in zip(*[a]*n) for e in t]
[7086, 7086, 4914, 4914, 1321, 1321, 1887, 1887, 7060, 7060]
Alternatively, just flatten a constructed list of sublists made [list_item]*n
like so:
>>> [e for sl in ([e]*n for e in a) for e in sl]
# same output
Which can then be constructed as a generator which is as efficient as possible here:
>>> it=(e for sl in ([e]*n for e in a) for e in sl)
>>> next(it)
7086
>>> next(it)
7086
>>> next(it)
4914
>>> ...
I suspewct the most efficient way is to pre-allocate the result list, then fill it in with a loop. This way you don't have to allocate lots of temporary lists and repeatedly grow the result list. These memory allocations are likely to be the most expensive part of the code.
result = [0] * (len(a) * n)
for i, x in enumerate(a):
for j in range(i*n, i*n+n):
result[j] = x
%%timeit
b=[]
for x in a:
b += [x]*n
with %%timeit, 994 ns ± 8.22 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each).
Thanks to @Barmar for the +=
recommentdation!