Track value changes in a repetitive list in Python

Question

I have a list with repeating values as shown below:

x = [1, 1, 1, 2, 2, 2, 1, 1, 1]

This list is generated from a pattern matching regular expression (not shown here). The list is guaranteed to have repeating values (many, many repeats - hundreds, if not thousands), and is never randomly arranged because that's what the regex is matching each time.

What I want is to track the list indices at which the entries change from the previous value. So for the above list x, I want to obtain a change-tracking list [3, 6] indicating that x[3] and x[6] are different from their previous entries in the list.

I managed to do this, but I was wondering if there was a cleaner way. Here's my code:

x = [1, 1, 1, 2, 2, 2, 1, 1, 1]

flag = []
for index, item in enumerate(x):
    if index != 0:
        if x[index] != x[index-1]:
            flag.append(index)

print flag

Output: [3, 6]

Question: Is there a cleaner way to do what I want, in fewer lines of code?

well looking at it you could get rid of `lag` by just using `index-1` in your second if statement, and change the second if to `!=` and that way you can drop the else and move that code up to the if — James Kent, Jan 30 '15 at 18:11
@JamesKent That's a good idea. I updated the question and the code. Thanks. — prrao, Jan 30 '15 at 18:17
You already have `item`, so you don't need to access `x[index]` again for the comparison to `x[index-1]` — Joel Cornett, Jan 30 '15 at 18:21
possible duplicate of [find length of sequences of identical values in a numpy array](http://stackoverflow.com/questions/1066758/find-length-of-sequences-of-identical-values-in-a-numpy-array) — Joe, Feb 10 '15 at 13:28

Bhargav Rao · Accepted Answer · 2015-01-30T18:19:03.907

9

It can be done using a list comprehension over a range:

>>> x = [1, 1, 1, 2, 2, 2, 3, 3, 3]
>>> [i for i in range(1,len(x)) if x[i]!=x[i-1] ]
[3, 6]
>>> x = [1, 1, 1, 2, 2, 2, 1, 1, 1]
>>> [i for i in range(1,len(x)) if x[i]!=x[i-1] ]
[3, 6]
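
For reuse, the comprehension above can be wrapped in a function. This is a minimal sketch in Python 3; the name `change_indices` is hypothetical and not part of the answer.

```python
def change_indices(values):
    """Return every index i where values[i] differs from values[i - 1]."""
    # range(1, len(values)) is empty for inputs shorter than two items,
    # so empty and single-element lists simply produce [].
    return [i for i in range(1, len(values)) if values[i] != values[i - 1]]


print(change_indices([1, 1, 1, 2, 2, 2, 1, 1, 1]))  # [3, 6]
```

Because it indexes into `values`, this version needs a sequence such as a list or tuple rather than a one-shot iterator.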

edited Jan 30 '15 at 18:19

answered Jan 30 '15 at 18:10


Ashwini Chaudhary · Answer 2 · 2015-01-30T18:22:32.967

You can do something like this using itertools.izip, itertools.tee and a list-comprehension:

from itertools import izip, tee
it1, it2 = tee(x)
next(it2)
print [i for i, (a, b) in enumerate(izip(it1, it2), 1) if a != b]
# [3, 6]
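
The snippet above is Python 2 (`izip` and the `print` statement). A sketch of the same idea in Python 3, where the built-in `zip` is already lazy, might look like this; the function name `change_indices_pairwise` is hypothetical.

```python
from itertools import tee


def change_indices_pairwise(values):
    """Compare each item with its successor using two offset iterators."""
    it1, it2 = tee(values)
    # Advance the second iterator by one; the default avoids
    # StopIteration when the input is empty.
    next(it2, None)
    # Pair k (counting from 1) compares values[k - 1] with values[k].
    return [i for i, (a, b) in enumerate(zip(it1, it2), 1) if a != b]


print(change_indices_pairwise([1, 1, 1, 2, 2, 2, 1, 1, 1]))  # [3, 6]
```

Since `tee` works on any iterable, this also accepts a generator, which the index-based versions cannot do.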

Another alternative uses itertools.groupby on enumerate(x). groupby groups consecutive equal items together, so all we need is the index of the first item of each group except the first one:

from itertools import groupby
from operator import itemgetter
it = (next(g)[0] for k, g in groupby(enumerate(x), itemgetter(1)))
next(it) # drop the first group
print list(it)
# [3, 6]
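
A Python 3 rendering of the groupby approach, as a sketch: `islice` drops the first group's start index instead of a separate `next` call, which also keeps the function safe on empty input. The name `change_indices_groupby` is hypothetical.

```python
from itertools import groupby, islice
from operator import itemgetter


def change_indices_groupby(values):
    """Return the start index of every run of equal values except the first."""
    # Group (index, value) pairs by value; the first pair of each group
    # carries the index at which that run starts.
    starts = (
        next(group)[0]
        for _, group in groupby(enumerate(values), key=itemgetter(1))
    )
    # Skip the first run, which always starts at index 0.
    return list(islice(starts, 1, None))


print(change_indices_groupby([1, 1, 1, 2, 2, 2, 1, 1, 1]))  # [3, 6]
```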

If NumPy is an option:

>>> import numpy as np
>>> np.where(np.diff(x) != 0)[0] + 1
array([3, 6])

I was thinking `list(accumulate(len(list(g)) for k,g in groupby(x)))[:-1]` before I came to my senses.. — DSM, Jan 30 '15 at 18:29
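
For comparison, the one-liner from the comment above does run under Python 3, where `accumulate` lives in `itertools`. A sketch with a hypothetical function name, counting run lengths without building intermediate lists:

```python
from itertools import accumulate, groupby


def change_indices_runlengths(values):
    """Derive change indices from the running total of run lengths."""
    # Length of each run of consecutive equal values.
    run_lengths = (sum(1 for _ in group) for _, group in groupby(values))
    # Each running total is the start index of the next run; the final
    # total equals len(values) and is not a change point, so drop it.
    return list(accumulate(run_lengths))[:-1]


print(change_indices_runlengths([1, 1, 1, 2, 2, 2, 1, 1, 1]))  # [3, 6]
```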

Mazdak · Answer 3 · 2015-01-30T18:36:46.150

2

Instead of indexing the list twice for every element, you can use an iterator to fetch the next element in the list:

>>> x =[1, 1, 1, 2, 2, 2, 3, 3, 3]
>>> i_x=iter(x[1:])
>>> [i for i,j in enumerate(x[:-1],1) if j!=next(i_x)]
[3, 6]

edited Jan 30 '15 at 18:36

answered Jan 30 '15 at 18:10


4

This is quadratic runtime, and it does not handle the case `[1, 1, 1, 2, 2, 2, 1, 1, 1]` correctly. – Sven Marnach Jan 30 '15 at 18:11
@SvenMarnach +1, I was against the use of `set` for this very reason. – prrao Jan 30 '15 at 18:20

score 2 · Answer 4 · answered Jan 30 '15 at 18:16


I'm here to add the obligatory answer that contains a list comprehension.

flag = [i+1 for i, value in enumerate(x[1:]) if (x[i] != value)]

answered Jan 30 '15 at 18:16

Roberto


score 1 · Answer 5 · answered Jan 30 '15 at 18:23

itertools.izip_longest is what you are looking for:

from itertools import islice, izip_longest

flag = []
# leader starts at x[1], trailer at x[0], so each pair is (x[i], x[i-1])
leader, trailer = islice(x, 1, None), iter(x)
for i, (current, previous) in enumerate(izip_longest(leader, trailer), 1):
    # Skip comparing the last entry to nothing (leader runs out first)
    # If None is a valid value use a different sentinel for izip_longest
    if current is None:
        continue
    if current != previous:
        flag.append(i)
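
The comment in the loop above suggests a different sentinel when `None` can appear in the data. A sketch of that variant in Python 3 (`zip_longest` replaces `izip_longest`); the names `_MISSING` and `change_indices_zip_longest` are hypothetical, and the input is assumed to be a sequence that can be iterated twice.

```python
from itertools import islice, zip_longest

# Unique placeholder that cannot collide with real data, including None.
_MISSING = object()


def change_indices_zip_longest(values):
    """Pair each item with its predecessor, padding with a private sentinel."""
    leader = islice(values, 1, None)  # starts at values[1]
    trailer = iter(values)            # starts at values[0]
    flag = []
    pairs = zip_longest(leader, trailer, fillvalue=_MISSING)
    for i, (current, previous) in enumerate(pairs, 1):
        # The leader runs out one step early; skip that padded pair.
        if current is _MISSING:
            continue
        if current != previous:
            flag.append(i)
    return flag


print(change_indices_zip_longest([None, None, 1, 1, None]))  # [2, 4]
```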
