2

I have variable lists filled with 0s and 1s, like:

l1 = [1,1,1,0,1,1]
l2 = [0,1,1,0,1,1,0,0,1]

what is the most efficient way to create new lists that squash all consecutive 1s.

So the result would be:

l1_new = [1,0,1]
l2_new = [0,1,0,1,0,0,1]

Hint: numpy/vectorization or some logical operation would be great!

gustavz
  • 2,964
  • 3
  • 25
  • 47

5 Answers5

3

To leverage NumPy, we would need array, so upon that conversion, we will create an appropriate mask and index -

def squash1s(a):
    a = np.asarray(a)
    m = a==1
    return a[np.r_[True,m[:-1]!=m[1:]] | (a==0)]

Sample runs -

In [64]: l1 = [1,1,1,0,1,1]
    ...: l2 = [0,1,1,0,1,1,0,0,1]

In [65]: squash1s(l1)
Out[65]: array([1, 0, 1])

In [66]: squash1s(l2)
Out[66]: array([0, 1, 0, 1, 0, 0, 1])

Benchmarking

Since we are concerned with performance efficiency, let's benchmark the NumPy ones, as they should be pretty efficient.

Other proposed solutions

# @yatu's solution
def yatu_diff(l):
    a = np.asarray(l)
    return a[~((np.diff(a,prepend=False)==0) & a==1)]

# PaulPanzer's suggestion/soln
def pp_concat(a):
    a = np.asarray(a)
    return a.repeat(1-(a*np.concatenate([[0],a[:-1]])))

Using benchit package (few benchmarking tools packaged together; disclaimer: I am its author) to benchmark proposed solutions.

import benchit

funcs = [squash1s, yatu_diff, pp_concat]

# With ~30% 0s
in_ = [(np.random.rand(n)>0.3).astype(int) for n in 10**np.arange(4)]
t = benchit.timings(funcs, in_)
t.plot(logx=True, save='timings30.png')

# With ~70% 0s
in_ = [(np.random.rand(n)>0.7).astype(int) for n in 10**np.arange(4)]
t = benchit.timings(funcs, in_)
t.plot(logx=True, save='timings70.png')

With ~30% 0s :

enter image description here

With ~70% 0s :

enter image description here

Divakar
  • 218,885
  • 19
  • 262
  • 358
  • More and more benchmarking tools to choose from :-) Out of curiosity, could you include `a.repeat(1-(a*np.concatenate([[0],a[:-1]])))`? – Paul Panzer Jun 18 '20 at 11:50
  • @PaulPanzer Nice! You gotta post that as a solution. And you gotta try out `benchit` :) Would love some feedback/possible improvements! – Divakar Jun 18 '20 at 11:56
  • Thanks! Right now I've no time, but will do later. Btw. I suspect mine will scale less good for larger n. – Paul Panzer Jun 18 '20 at 11:59
3

Here's one approach using with np.diff and bitwise operations:

l1 = [1,1,1,0,1,1]
l2 = [0,1,1,0,1,1,0,0,1]

a = np.array(l2)
a[~((np.diff(a,prepend=False)==0) & (a==1))]
# array([0, 1, 0, 1, 0, 0, 1])

Or for the first example:

a = np.array(l1)
a[~((np.diff(a,prepend=False)==0) & (a==1))]
#array([1, 0, 1])
yatu
  • 86,083
  • 12
  • 84
  • 139
  • perfect, thank you. i have a similar (slighty different) question here https://stackoverflow.com/questions/62447168/manipulate-python-list-depending-on-other-lists-elements/62448122#62448122 maybe you can anwser that aswell – gustavz Jun 18 '20 at 11:28
  • yw @gustav - will take a look, don't have much time now :) – yatu Jun 18 '20 at 11:29
1

Given

lst_1 = [1, 1, 1, 0, 1, 1]
lst_2 = [0, 1, 1, 0, 1, 1, 0, 0, 1]
lst_3 = [0, 1, 1, 0, 1, 1, 0, 0, 1, 1]

Code

def compress_values(seq, value=1):
    """Yield a value in isolation."""

    for here, nxt in zip(seq, seq[1:]):

        if here == nxt == value:
            continue
        else:
            yield here

    yield nxt        

Demo

assert [1, 0, 1] == list(compress_values(lst_1))
assert [0, 1, 0, 1, 0, 0, 1] == list(compress_values(lst_2))
assert [0, 1, 0, 1, 0, 0, 1] == list(compress_values(lst_3))

Details

Slide a two-tuple window. If values are equal to each other and the target value, skip. Otherwise yield values.

An alternative, more general approach:


import itertools as it


def squash(seq, values=(1,)):
    """Yield singular values in isolation."""

    for k, g in it.groupby(seq):

        if k in values:
            yield k
        else:
            yield from g

pylang
  • 40,867
  • 14
  • 129
  • 121
-1

Here is a possible solution

arr = [0,1,1,0,1,1,0,0,1]

previous_value = None
new_lst = []

for elem in arr:
   if elem != previous_value:
       new_lst.append(elem)
       previous_value = elem

print(new_lst)

Where arr is any list you want, and it will work with anything including strings.

Anik Patel
  • 143
  • 4
-1

Using a while loop,

l1 = [1,1,1,0,1,1]
i = 0

while i < len(l1)-1:
    if l1[i] == l1[i+1]:
        del l1[i]
    else:
        i = i+1
print(l1)

another one that i found on geekforgeeks is ,

from itertools import zip_longest 


test_list = [1, 4, 4, 4, 5, 6, 7, 4, 3, 3, 9] 

res = [i for i, j in zip_longest(test_list, test_list[1:]) 
                                                if i != j] 
Prathamesh
  • 1,064
  • 1
  • 6
  • 16