Reduce increments in array values to 1

Question

I am trying to create a list (say B) which increments only when there is a difference in values of another list (say A), for example:

[1,1,2,2,4,4] to [0,0,1,1,2,2] or

[1,1,1,1,4,4,4,4] to [0,0,0,0,1,1,1,1] etc.

The following code does it:

boxes=[1,1,1,1,4,4,4,4]
positions=[0]
position=0
for psn,box in list(enumerate(boxes))[:-1]:
    if boxes[psn+1]-box ==0:
        increment=0
    else:
        increment=1
    position=position+increment
    positions.append(position)
print(positions)

Can anybody suggest how to do this using list comprehensions (preferably using lambda functions)?

Thank you so much, but the motive is to know how it can be done using lambda functions. — Ashok, Jan 09 '19 at 14:48
Actually, I misunderstood. `numpy.diff` is not what you're looking for. — pault, Jan 09 '19 at 14:50
It's not going to be easy (without side-effects or nested loops) to do this using a list comprehension. — pault, Jan 09 '19 at 14:58

score 6 · Answer 1 · answered Jan 09 '19 at 14:48


Use itertools.groupby:

from itertools import groupby

a = [1,1,2,2,4,4]

result = [i for i, (_, group) in enumerate(groupby(a)) for _ in group]
print(result)

Output

[0, 0, 1, 1, 2, 2]
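To see why this works, it can help to look at what groupby produces before the indices are attached. The following is a small sketch of the intermediate step; the name `runs` is introduced here only for illustration and is not part of the original answer.

```python
from itertools import groupby

a = [1, 1, 2, 2, 4, 4]

# groupby yields one (key, group) pair per run of equal adjacent values.
runs = [(key, list(group)) for key, group in groupby(a)]
print(runs)  # [(1, [1, 1]), (2, [2, 2]), (4, [4, 4])]

# enumerate numbers the runs 0, 1, 2, ...; the inner loop repeats each
# run number once for every element in that run.
result = [i for i, (_, group) in enumerate(groupby(a)) for _ in group]
print(result)  # [0, 0, 1, 1, 2, 2]
```

Because groupby only merges adjacent equal values, a value that reappears later starts a new run, which matches the behaviour of the loop in the question.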

answered Jan 09 '19 at 14:48

Dani Mesejo


yatu · Accepted Answer · 2019-01-09T14:59:06.223

Here's a way using numpy:

import numpy as np

a = [1,1,2,2,4,4]
[0] + np.cumsum(np.clip(np.diff(a), 0, 1)).tolist()
[0, 0, 1, 1, 2, 2]

Or for the other example:

a = [1,1,1,1,4,4,4,4]
[0] + np.cumsum(np.clip(np.diff(a), 0, 1)).tolist()
[0, 0, 0, 0, 1, 1, 1, 1]

Details

a = [1,1,2,2,4,4]

Get the first difference of the array with np.diff

np.diff(a)
array([0, 1, 0, 2, 0])

And use np.clip to limit the values between 0 and 1:

np.clip(np.diff(a), 0, 1)
array([0, 1, 0, 1, 0])

Finally take the np.cumsum and add a 0 at the beginning as the difference will give you an array of length n-1:

[0] + np.cumsum(np.clip(np.diff(a), 0, 1)).tolist()
[0, 0, 1, 1, 2, 2]
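One caveat: clipping to the range 0 to 1 turns negative differences into 0, so the approach above counts a change only when the value goes up. For input such as [4,4,1,1] it returns [0, 0, 0, 0], whereas the loop in the question would give [0, 0, 1, 1]. If decreasing values can occur, a possible variant is to compare the differences against 0 instead of clipping them. This is a sketch under that assumption; `run_positions` is a hypothetical helper name, not something from the original answer.

```python
import numpy as np

def run_positions(values):
    """Hypothetical helper: index of the run each element belongs to."""
    values = np.asarray(values)
    if values.size == 0:
        return []
    # True wherever an element differs from its predecessor,
    # regardless of whether the value went up or down.
    changed = np.diff(values) != 0
    # cumsum over booleans counts the changes seen so far; prepend 0
    # because diff returns one element fewer than the input.
    return [0] + np.cumsum(changed).tolist()

print(run_positions([1, 1, 2, 2, 4, 4]))  # [0, 0, 1, 1, 2, 2]
print(run_positions([4, 4, 1, 1]))        # [0, 0, 1, 1]
```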

score 3 · Answer 3 · answered Jan 09 '19 at 14:53

I see numpy solutions, so here we go.

digitize

import numpy as np
A = [1,1,1,1,4,4,4,4]
np.digitize(A, np.unique(A)) - 1
# array([0, 0, 0, 0, 1, 1, 1, 1])

factorize

import pandas as pd
pd.factorize(A)[0]
# array([0, 0, 0, 0, 1, 1, 1, 1])

groupby and ngroup

pd.Series(A).groupby(A).ngroup()

0    0
1    0
2    0
3    0
4    1
5    1
6    1
7    1
dtype: int64

unique

np.unique(A, return_inverse=True)[1]
# array([0, 0, 0, 0, 1, 1, 1, 1])

Using list comprehension with itertools.accumulate:

from itertools import accumulate
from operator import add

list(accumulate([0] + [x != y for x, y in zip(A, A[1:])], add))
# [0, 0, 0, 0, 1, 1, 1, 1]
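Note that the options above are not interchangeable on every input. The accumulate version counts changes between neighbours, while the factorize and unique variants label elements by value, so a value that reappears later gets its earlier label back. On the sorted examples from the question the results coincide. The sketch below illustrates the difference using only the standard library; `run_index` and `first_seen_label` are hypothetical names, and `first_seen_label` is a plain-Python stand-in for value-based labelling, not the pandas implementation.

```python
from itertools import accumulate

def run_index(values):
    """Count changes between neighbouring elements (as in the question)."""
    if not values:
        return []
    changes = [x != y for x, y in zip(values, values[1:])]
    # accumulate adds by default; True and False behave as 1 and 0.
    return list(accumulate([0] + changes))

def first_seen_label(values):
    """Stand-in for value-based labelling: equal values share a label."""
    labels = {}
    return [labels.setdefault(v, len(labels)) for v in values]

A = [1, 1, 2, 2, 1, 1]
print(run_index(A))         # [0, 0, 1, 1, 2, 2]
print(first_seen_label(A))  # [0, 0, 1, 1, 0, 0]
```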

score 2 · Answer 4 · answered Jan 09 '19 at 15:12

You can't do this with a traditional list comprehension, because a comprehension has no built-in way to carry state from one iteration to the next.

In this case, using itertools.groupby, numpy, or a plain Python loop (as in your code) is recommended.

BUT if you really wanted to use a list comprehension, one way would be to rely on side effects.

For example:

boxes=[1,1,1,1,4,4,4,4]
positions = [0]
throwaway = [
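    # The parentheses matter: without them the expression parses as
    # (positions[-1] + 0) if ... else 1, which never counts past 1.
    positions.append(positions[-1] + (0 if boxes[psn+1] - box == 0 else 1))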
    for psn, box in enumerate(boxes[:-1])
]
print(positions)
#[0, 0, 0, 0, 1, 1, 1, 1]

You are using the list comprehension to create a list called throwaway, but the actual contents of throwaway are not useful at all. We use the iterations to call append on positions. Since append returns None, the following is the actual result of the list comprehension.

print(throwaway)
#[None, None, None, None, None, None, None]

However, relying on side effects like this is not considered good practice.
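For readers on Python 3.8 or later (newer than this question), an assignment expression gives another way to carry a running total through a comprehension without building a throwaway list. This is only a sketch, assuming that Python version; `positions_with_walrus` is a hypothetical name. It still updates a variable as a side effect of evaluating each element, so the readability concerns above apply here too.

```python
def positions_with_walrus(boxes):
    """Hypothetical helper: running count of changes, via := (Python 3.8+)."""
    if not boxes:
        return []
    position = 0
    return [0] + [
        # Rebind position on every step; (nxt != cur) is True (1) when
        # the value changes and False (0) otherwise.
        (position := position + (nxt != cur))
        for cur, nxt in zip(boxes, boxes[1:])
    ]

print(positions_with_walrus([1, 1, 1, 1, 4, 4, 4, 4]))  # [0, 0, 0, 0, 1, 1, 1, 1]
print(positions_with_walrus([1, 1, 2, 2, 4, 4]))        # [0, 0, 1, 1, 2, 2]
```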

Daweo · Answer 5 · 2019-01-09T15:32:56.663

A method using zip, list comprehensions, and slicing

a = [1,1,2,2,4,4]
increments = [bool(i[1]-i[0]) for i in zip(a,a[1:])]
b = [sum(increments[:i]) for i in range(len(increments)+1)]
print(b) #prints [0, 0, 1, 1, 2, 2]

Explanation: this solution relies on the following Python behavior:

any number other than 0 (or 0.0) evaluates to True when passed to the bool function

in arithmetic, True and False are treated as 1 and 0 respectively

sum starts from 0, so sum([3,4]) computes 0+3+4; likewise, sum([True,True]) computes 0+True+True, which is 0+1+1
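The three points above can be checked directly. The following is a short sketch using the same list as the answer; the variable names are reused from the answer's code, and nothing here goes beyond standard Python.

```python
a = [1, 1, 2, 2, 4, 4]

# Nonzero differences become True, zero differences become False.
increments = [bool(y - x) for x, y in zip(a, a[1:])]
print(increments)           # [False, True, False, True, False]

# Booleans take part in arithmetic as 1 and 0.
print(True + True)          # 2

# Summing a prefix counts how many changes occurred up to that point.
print(sum(increments[:0]))  # 0
print(sum(increments[:4]))  # 2

b = [sum(increments[:i]) for i in range(len(increments) + 1)]
print(b)                    # [0, 0, 1, 1, 2, 2]
```

Since each prefix is summed again from the start, the work grows quadratically with the length of the list; that is fine for short inputs, but a running total does the same job in a single pass.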
