4

Running

L = [1,2,3,4,5,6]
print zip(L,L[1:])[::2]

yields

[(1, 2), (3, 4), (5, 6)]

What zip (or other) statement would yield instead

[1, 2, None, 3, 4, None, 5, 6, None]

?

Update

It's quite all right to start with

L = [(1,2),(3,4),(5,6)]

so long as the statement remains a (fast) one-liner.

Update2

One use case of inserting None is to plot segments quickly.

Community
  • 1
  • 1
Calaf
  • 10,113
  • 15
  • 57
  • 120

6 Answers6

7

You can do something like this:

>>> L = [1,2,3,4,5,6]
>>> it = zip(*[iter(L)] * 2)
>>> [y for x in it for y in x + (None,)]
[1, 2, None, 3, 4, None, 5, 6, None]

Performance and space complexity wise @mgilson's approach if modified slightly is the best one of the lot:

>>> from itertools import izip, chain
>>> L = [1,2,3,4,5,6]*10**5
>>> %timeit [y for x in zip(*[iter(L)] * 2) for y in x + (None, )]
10 loops, best of 3: 47.2 ms per loop

If we remove the list-comprehension and use itertools.chain.from_iterable then you can see there's a significant improvement:

>>> %timeit list(chain.from_iterable(x + (None,) for x in izip(*[iter(L)] * 2)))
10 loops, best of 3: 31.8 ms per loop
>>> %timeit list(insert_none_while(L)) # mgilson's approach
10 loops, best of 3: 50.7 ms per loop
>>> %timeit list(insert_none_for(L))
10 loops, best of 3: 32.6 ms per loop

Here insert_none_while is @mgilson's original code and insert_none_for is:

def insert_none_for(iterable):
    it = iter(iterable)
    for x in it:
        yield x
        yield next(it)
        yield None

Update

A slightly modified version of @Padraic Cunningham's proposed solution seems to be the fastest(only by a slight margin compared to @Jochen Ritzel solution when used with itertools.izip):

>>> L = [1,2,3,4,5,6]*10**6
>>> %timeit [y for x in zip(*[iter(L)] * 2) for y in x + (None, )]
1 loops, best of 3: 541 ms per loop
>>> %timeit list(chain.from_iterable(x + (None,) for x in izip(*[iter(L)] * 2)))
1 loops, best of 3: 349 ms per loop
# Using while 1 and cached next function
>>> %timeit list(insert_none_while_one(L))
1 loops, best of 3: 470 ms per loop
# Cached next function
>>> %timeit list(insert_none_for(L))
1 loops, best of 3: 351 ms per loop
# Jochen Ritzel's original solutions
>>> %timeit it = iter(L); list(itertools.chain.from_iterable(zip(it, it, repeat(None))))
1 loops, best of 3: 352 ms per loop
# Jochen Ritzel's solutions using izip
>>> %timeit it = iter(L); list(itertools.chain.from_iterable(izip(it, it, repeat(None))))
10 loops, best of 3: 167 ms per loop
# Padraic Cunningham's solution using slicing
>>> %timeit list(chain.from_iterable(izip_longest(L[::2],L[1::2],[None])))
1 loops, best of 3: 236 ms per loop
# Padraic Cunningham's solution using iter 
>>> %timeit it=iter(L); list(chain.from_iterable(izip_longest(it, it, [])))
10 loops, best of 3: 156 ms per loop
# Kasra
>>> %timeit list(chain(*[L[i:i+2]+[None] for i in range(0,len(L),2)]))
1 loops, best of 3: 1.43 s per loop

Still not good enough?

Consider using NumPy arrays:

>>> arr = np.array(L, dtype=float)
>>> arr.size
6000000
>>> %timeit np.insert(arr.reshape(-1, 2), 2, None, axis=1).ravel()
10 loops, best of 3: 80.8 ms per loop

Related: How does zip(*[iter(s)]*n) work in Python?

Community
  • 1
  • 1
Ashwini Chaudhary
  • 244,495
  • 58
  • 464
  • 504
  • wonderful.. just two questions. Could you add a brief explanation? And how fast will each line be if L is huge? – Calaf Jan 22 '15 at 19:28
  • 1
    Was *just* going to post this... stop beating me by a few seconds everywhere please Ashwini :) – Jon Clements Jan 22 '15 at 19:29
  • @JonClements -- By the timestamps, it looks like 3 minutes ;-)... FWIW, I thought about posting something similar. The `zip(*[iter(..)])` is well enough known, but pairing it with a nested comprehension... I dunno. Seems a bit much :-). – mgilson Jan 22 '15 at 19:31
  • Hmm ... Interesting that a `for` should perform that much better than a `while True`. I suppose there's a whole bunch of extra conditional checks for the `while True` loop, though it seems that should be a common case that python should be able to optimize for (e.g. skipping the check). – mgilson Jan 22 '15 at 19:55
  • @mgilson One issue is `True` is looked up globally each time in Python 2, `while 1` may speed it a little bit. Plus an extra call to `next()` and `POP_JUMP_IF_FALSE` each time in the loop. – Ashwini Chaudhary Jan 22 '15 at 20:00
  • Right, I suppose that I'm thinking that the POP_JUMP_IF_FALSE operation can be optimized away (but you're right -- It can only be optimized away in python3.x) I suppose the extra global lookup of `next` is problematic as well. I suppose that can easily enough be mitigated by the good ole `def insert_none(iterable, next=next):` trick if you really wanted to... pair that with the `while 1` suggestion and you might start getting closer to the performance of `insert_none_for`... – mgilson Jan 22 '15 at 20:04
  • `for y in it: yield y break` is for me marginally faster even than cached `next`. – zch Jan 22 '15 at 20:10
  • how does `list(chain.from_iterable(izip_longest(L[::2],L[1::2],[None])))` go? – Padraic Cunningham Jan 22 '15 at 20:16
  • @PadraicCunningham, for me it's faster than everything proposed before, but `list(chain.from_iterable(izip_longest(*[iter(L)]*2 + [[]])))` is a little bit faster still. – zch Jan 22 '15 at 20:22
  • Cool, thanks. `list(chain.from_iterable(izip_longest(*[iter(L)]*2 + [[]])))` is a bit faster for me too `28.8ms` vs `26.3ms`. – Padraic Cunningham Jan 22 '15 at 20:25
  • I'm check-marking your answer because you included some time results. That said, someone, perhaps myself, should run an exhaustive time comparison and check-mark the appropriate answer. – Calaf Jan 23 '15 at 23:37
5

A simple generator'll do:

>>> def insert_none(iterable):
...   itr = iter(iterable)
...   while True:
...     yield next(itr)
...     yield next(itr)
...     yield None
... 
>>> list(insert_none([1, 2, 3, 4, 5, 6]))
[1, 2, None, 3, 4, None, 5, 6, None]
>>> list(insert_none([1, 2, 3, 4, 5]))
[1, 2, None, 3, 4, None, 5]
mgilson
  • 300,191
  • 65
  • 633
  • 696
  • Are you sure this is fast enough if L is huge? – Calaf Jan 22 '15 at 19:29
  • 2
    This is likely the _best_ answer if `L` is huge. It doesn't create any intermediate lists like you do when you need to make slices to pass to `zip`. – mgilson Jan 22 '15 at 19:30
5

zip takes as many args as you like. itertools.repeat(None) gives you a infinite amount of nothing:

import itertools

L = [1,2,3,4,5,6]
it = iter(L)
nons = itertools.repeat(None)

pairs = zip(it,it,nons)

The other start is simple:

L = [(1,2),(3,4),(5,6)]
pairs = [(a,b,None) for a,b in L]

To flatten the list of tuples:

flat = itertools.chain.from_iterable(pairs)
Jochen Ritzel
  • 104,512
  • 31
  • 200
  • 194
  • You might as well finish it off -- OP wants it flat. `itertools.chain.from_iterable(zip(it, it, nons))` :-) – mgilson Jan 22 '15 at 19:35
  • But print [(a,b,None) for a,b in L] produces [(1, 2, None), (3, 4, None), (5, 6, None)]. – Calaf Jan 22 '15 at 19:37
  • @mgilson: for `L = [1,2,3,4,5]`, `list(chain.from_iterable(izip(it, it, repeat(None, len(L)))))` yields `[1, 2, None, 3, 4, None]` -- so the 5 is missing :/ – Dr. Jan-Philip Gehrcke Jan 22 '15 at 19:37
  • zip(it,it,nons) does it nicely. @mgilson: even nicer, though now I need an explanation :( – Calaf Jan 22 '15 at 19:38
  • 1
    @Jan-PhilipGehrcke -- That brings me to my comment on the question. What happens if the length isn't divisible by 2? I'm pretty sure that all the zip-based answers cut off a value which is why I used a generator based approach :-) – mgilson Jan 22 '15 at 19:39
3

A not-so-serious attempt at winning the code golf at this task, without any extra imports. Works similarly on Python 2 and 3. Disclaimer: this most probably ain't the fastest one :)

L = [1,2,3,4,5,6]
R = list(sum(zip(*[iter(L)]*2+[iter([].sort,0)]),()))
print(R)

Edit: actually this is shorter, though not as kludgey:

R = list(sum(zip(*[iter(L)]*2+[[None]*len(L)]),()))

Prints:

[1, 2, None, 3, 4, None, 5, 6, None]

Another fancy one using list slicing

L = [1,2,3,4,5,6]
R = [None] * (len(L) * 3 // 2)
R[::3] = L[::2]
R[1::3] = L[1::2]
print(R)

Or just insert Nones:

L = [1,2,3,4,5,6]
[ L.insert(i, None) for i in range(2, len(L) * 3 // 2, 3) ]
print(L)
2
out = []
for x in xrange(0,len(L)-1,2):
    out += L[x:x+2] + [None]
[1, 2, None, 3, 4, None, 5, 6, None]



from itertools import chain,izip
L = [1,2,3,4,5,6]

print(list(chain.from_iterable((x + (None,) for x in izip(L[::2],L[1::2])))))
[1, 2, None, 3, 4, None, 5, 6, None]

You can use izip_longest which will fill the missing values with None, you can iterate over without calling list if the list is very large and avoid reading all into memory at once:

from itertools import izip_longest
print(list(chain.from_iterable(izip_longest(L[::2],L[1::2],[None]))))
[1, 2, None, 3, 4, None, 5, 6, None]

As @ashwini pointed out combining with iter it becomes even more efficient:

it=iter(L)
list(chain.from_iterable(izip_longest(it, it, [])))
Padraic Cunningham
  • 176,452
  • 29
  • 245
  • 321
  • If L is large, then the speed of iterating in a loop is bound by that of the interpreter, hence this won't be adequate. Is that not right? – Calaf Jan 22 '15 at 19:33
  • @calaf, The itertools solution should be efficient, what happens with an uneven length list? – Padraic Cunningham Jan 22 '15 at 19:44
  • +1 But, slicing is expensive. We should replace it with iterators: `it=iter(L);list(chain.from_iterable(izip_longest(it, it, [])))`. Note that default fillvalue is already `None` so an empty list as third argument should also do it. 12 ms on my system, we have a winner. ;-) – Ashwini Chaudhary Jan 22 '15 at 20:30
  • @AshwiniChaudhary, I just put None in as I thought it made it a little more obvious, the `iter(L)` is neat but I don't get much performance gain on my system? – Padraic Cunningham Jan 22 '15 at 20:34
  • Yes, timing wise the difference is not going to be much(especially for small to mid-sized lists) but using slicing we are creating two extra lists in memory. So, using `iter()` saves both time and memory. For even bigger lists the difference is clearly visible, for `len(L)` = 6000000 the difference is 233 ms vs 156 ms. – Ashwini Chaudhary Jan 22 '15 at 20:38
  • @AshwiniChaudhary, yes of course, 267 vs 305 on my machine for L = 6000000. Cheers. – Padraic Cunningham Jan 22 '15 at 20:44
1

As an alternative , just with chain :

>>> list(chain(*[L[i:i+2]+[None] for i in range(0,len(L),2)]))
[1, 2, None, 3, 4, None, 5, 6, None]
Mazdak
  • 105,000
  • 18
  • 159
  • 188