I have a list with repeating values as shown below:

x = [1, 1, 1, 2, 2, 2, 1, 1, 1]

This list is generated from a pattern matching regular expression (not shown here). The list is guaranteed to have repeating values (many, many repeats - hundreds, if not thousands), and is never randomly arranged because that's what the regex is matching each time.

What I want is to track the list indices at which the entries change from the previous value. So for the above list x, I want to obtain a change-tracking list [3, 6] indicating that x[3] and x[6] are different from their previous entries in the list.

I managed to do this, but I was wondering if there was a cleaner way. Here's my code:

x = [1, 1, 1, 2, 2, 2, 1, 1, 1]

flag = []
for index, item in enumerate(x):
    if index != 0:
        if x[index] != x[index-1]:
            flag.append(index)

print flag

Output: [3, 6]

Question: Is there a cleaner way to do what I want, in fewer lines of code?

prrao
  • well looking at it you could get rid of `lag` by just using `index-1` in your second if statement, and change the second if to `!=` and that way you can drop the else and move that code up to the if – James Kent Jan 30 '15 at 18:11
  • @JamesKent That's a good idea. I updated the question and the code. Thanks. – prrao Jan 30 '15 at 18:17
  • You already have `item`, so you don't need to access `x[index]` again for the comparison to `x[index-1]` – Joel Cornett Jan 30 '15 at 18:21
  • possible duplicate of [find length of sequences of identical values in a numpy array](http://stackoverflow.com/questions/1066758/find-length-of-sequences-of-identical-values-in-a-numpy-array) – Joe Feb 10 '15 at 13:28

5 Answers

9

This can be done with a list comprehension over range, comparing each element with the one before it:

>>> x = [1, 1, 1, 2, 2, 2, 3, 3, 3]
>>> [i for i in range(1,len(x)) if x[i]!=x[i-1] ]
[3, 6]
>>> x = [1, 1, 1, 2, 2, 2, 1, 1, 1]
>>> [i for i in range(1,len(x)) if x[i]!=x[i-1] ]
[3, 6]
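
The same comparison can also be written without explicit indexing by pairing each element with its successor. A minimal Python 3 sketch, assuming the input is a list (or another sliceable sequence); the function name change_indices is only illustrative and not part of the original answer:

```python
def change_indices(seq):
    """Return the indices at which seq differs from its previous element."""
    # zip pairs seq[i-1] with seq[i]; enumerate starts at 1 so the
    # reported index belongs to the second element of each pair.
    return [i for i, (prev, cur) in enumerate(zip(seq, seq[1:]), 1) if prev != cur]


print(change_indices([1, 1, 1, 2, 2, 2, 1, 1, 1]))  # [3, 6]
print(change_indices([]))                           # []
```

Note that seq[1:] copies the list, so for very long inputs an iterator-based variant may be preferable.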
Bhargav Rao
3

You can do something like this using itertools.izip, itertools.tee and a list-comprehension:

from itertools import izip, tee
it1, it2 = tee(x)
next(it2)
print [i for i, (a, b) in enumerate(izip(it1, it2), 1) if a != b]
# [3, 6]
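
For readers on Python 3, where izip no longer exists because the built-in zip is already lazy, the snippet above might be ported roughly as follows. This is a sketch under that assumption; the wrapper name change_indices_tee is hypothetical:

```python
from itertools import tee


def change_indices_tee(iterable):
    """Indices where an element differs from its predecessor (any iterable)."""
    it1, it2 = tee(iterable)
    # Advance the second copy by one; the default stops an empty
    # input from raising StopIteration here.
    next(it2, None)
    return [i for i, (a, b) in enumerate(zip(it1, it2), 1) if a != b]


print(change_indices_tee([1, 1, 1, 2, 2, 2, 1, 1, 1]))  # [3, 6]
```

Because tee works on any iterable, this version also accepts generators. On Python 3.10 and later, itertools.pairwise yields the same consecutive pairs.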

Another alternative uses itertools.groupby on enumerate(x). groupby collects consecutive equal items into one group, so all we need is the index of the first item of each group except the first one:

from itertools import groupby
from operator import itemgetter
it = (next(g)[0] for k, g in groupby(enumerate(x), itemgetter(1)))
next(it) # drop the first group
print list(it)
# [3, 6]
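
A Python 3 sketch of the same groupby idea, wrapped in a function so that an empty input does not raise StopIteration when the first group is dropped. The name change_indices_groupby is hypothetical, and islice is swapped in for the bare next(it) call:

```python
from itertools import groupby, islice
from operator import itemgetter


def change_indices_groupby(seq):
    """Start index of every run of equal values except the first run."""
    # Group (index, value) pairs by value; the first pair of each
    # group carries the index at which that run begins.
    starts = (next(group)[0]
              for _, group in groupby(enumerate(seq), key=itemgetter(1)))
    # Skip the first run, which always starts at index 0.
    return list(islice(starts, 1, None))


print(change_indices_groupby([1, 1, 1, 2, 2, 2, 1, 1, 1]))  # [3, 6]
```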

If NumPy is an option:

>>> import numpy as np
>>> np.where(np.diff(x) != 0)[0] + 1
array([3, 6])
Ashwini Chaudhary
  • I was thinking `list(accumulate(len(list(g)) for k,g in groupby(x)))[:-1]` before I came to my senses.. – DSM Jan 30 '15 at 18:29
2

Instead of indexing the list twice on every step, you can use an iterator to fetch the next element of the list:

>>> x =[1, 1, 1, 2, 2, 2, 3, 3, 3]
>>> i_x=iter(x[1:])
>>> [i for i,j in enumerate(x[:-1],1) if j!=next(i_x)]
[3, 6]
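
The same single-pass idea can be written as a generator that remembers the previous value, which avoids both the slices and the repeated index lookups. A minimal Python 3 sketch; the name iter_change_indices is hypothetical:

```python
def iter_change_indices(iterable):
    """Lazily yield each index whose value differs from the one before it."""
    it = iter(iterable)
    try:
        previous = next(it)
    except StopIteration:
        return  # empty input: nothing to compare
    # The first element was consumed above, so counting starts at 1.
    for index, current in enumerate(it, 1):
        if current != previous:
            yield index
        previous = current


print(list(iter_change_indices([1, 1, 1, 2, 2, 2, 3, 3, 3])))  # [3, 6]
```

Since it yields indices one at a time, this form also suits inputs that are produced incrementally, such as matches streamed from a regex.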
Mazdak
2

I'm here to add the obligatory answer that contains a list comprehension.

flag = [i+1 for i, value in enumerate(x[1:]) if (x[i] != value)]
Roberto
1

itertools.izip_longest is what you are looking for:

from itertools import islice, izip_longest

flag = []
# leader starts one element ahead of trailer
leader, trailer = islice(x, 1, None), iter(x)
# Count from 1 so that i is the index of `current` in x
for i, (current, previous) in enumerate(izip_longest(leader, trailer), 1):
    # Skip comparing the last entry to nothing
    # If None is a valid value use a different sentinel for izip_longest
    if current is None:
        continue
        continue
    if current != previous:
        flag.append(i)
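
The sentinel remark in the comments can be made concrete. A Python 3 sketch, assuming zip_longest (the Python 3 name of izip_longest) and a private sentinel object so that None may appear in the data; the names _MISSING and change_indices_longest are hypothetical:

```python
from itertools import islice, zip_longest

_MISSING = object()  # unique fill value that cannot collide with real data


def change_indices_longest(seq):
    """Indices where seq differs from its previous element."""
    flag = []
    # leader runs one element ahead of trailer.
    leader, trailer = islice(seq, 1, None), iter(seq)
    pairs = zip_longest(leader, trailer, fillvalue=_MISSING)
    for i, (current, previous) in enumerate(pairs, 1):
        if current is _MISSING:
            continue  # the last entry has no successor to compare with
        if current != previous:
            flag.append(i)
    return flag


print(change_indices_longest([1, 1, 1, 2, 2, 2, 1, 1, 1]))  # [3, 6]
print(change_indices_longest([None, None, 1, None]))        # [2, 3]
```

Plain zip would stop at the shorter iterator and make the sentinel unnecessary; zip_longest is kept here only to mirror the approach above.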
Sean Vieira