5

How do I remove consecutive duplicates from a list like this in Python?

lst = [1,2,2,4,4,4,4,1,3,3,3,5,5,5,5,5]

Having a unique list or set wouldn't solve the problem as there are some repeated values like 1,...,1 in the previous list.

I want the result to be like this:

newlst = [1,2,4,1,3,5]

Would you also please consider the case where I have a list like [4, 4, 4, 4, 2, 2, 3, 3, 3, 3, 3, 3] and want the result to be [4,2,3,3] rather than [4,2,3]?

Bharel
Elmahy

7 Answers

12

itertools.groupby() is your solution.

import itertools

newlst = [k for k, g in itertools.groupby(lst)]
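
To see why this works: without a key function, itertools.groupby only groups items that are adjacent and equal, so a value that reappears later in the list starts a new group. A minimal sketch using the list from the question (the `runs` name is only illustrative):

```python
import itertools

lst = [1, 2, 2, 4, 4, 4, 4, 1, 3, 3, 3, 5, 5, 5, 5, 5]

# Each (key, group) pair covers one run of adjacent equal items.
runs = [(key, len(list(group))) for key, group in itertools.groupby(lst)]
print(runs)  # [(1, 1), (2, 2), (4, 4), (1, 1), (3, 3), (5, 5)]

# Keeping only the keys collapses every run to a single item.
newlst = [key for key, _ in itertools.groupby(lst)]
print(newlst)  # [1, 2, 4, 1, 3, 5]
```

Note that the two runs of 1 stay separate, which is what a set or a "unique" list would not preserve.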

If you wish to limit each group's size by the item's value, so that eight 4's become [4,4] and nine 3's become [3,3,3], here are two options that do it:

import itertools

def special_groupby(iterable):
    last_element = 0
    count = 0
    state = False
    def key_func(x):
        nonlocal last_element
        nonlocal count
        nonlocal state
        # Start a new group when the value changes or the current
        # group already holds x items.
        if last_element != x or count >= x:
            last_element = x
            count = 1
            state = not state
        else:
            count += 1
        return state
    return [next(g) for k, g in itertools.groupby(iterable, key=key_func)]

special_groupby(lst)

OR

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return itertools.zip_longest(*args, fillvalue=fillvalue)

newlst = list(itertools.chain.from_iterable(next(zip(*grouper(g, k))) for k, g in itertools.groupby(lst)))

Choose whichever you deem appropriate. Both methods are for numbers > 0.
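
The same rule can also be written as a plain loop, which may be easier to follow than a stateful key function. This is a sketch rather than the answer's own code: `collapse_runs_by_value` is a hypothetical name, and it assumes the items are positive integers, as stated above.

```python
def collapse_runs_by_value(items):
    """Keep one item per chunk of up to `value` consecutive repeats of `value`."""
    result = []
    previous = object()  # sentinel that compares unequal to any input item
    count = 0
    for item in items:
        if item != previous or count >= item:
            # A new run starts, or the current chunk already holds `item` items.
            result.append(item)
            previous = item
            count = 1
        else:
            count += 1
    return result

print(collapse_runs_by_value([4, 4, 4, 4, 2, 2, 3, 3, 3, 3, 3, 3]))  # [4, 2, 3, 3]
```

With this rule, eight 4's give [4, 4] and nine 3's give [3, 3, 3].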

Bharel
  • It works very well, but with a list like [4, 4, 4, 4, 2, 2, 3, 3, 3, 3, 3, 3], I want the result to be [4,2,3,3] rather than [4,2,3]. Can you guide me to solve this problem? – Elmahy Aug 30 '16 at 22:50
  • @ahmedmar Why would there be `[4,2,3,3]`? `[4,2,3]` is the correct output in this case. You wanted to remove duplicates, and there is nothing in between. – Bharel Aug 30 '16 at 23:03
  • Is there a way to give itertools a limit to group the list by? I.e., I want every 3,3,3 to become 3 and every 4,4,4,4 to become 4. – Elmahy Aug 30 '16 at 23:13
  • Yes, you helped me a lot with removing duplicates. I just want a more specific solution for my case, if possible. Any idea how I could do it? – Elmahy Aug 30 '16 at 23:16
  • Another example: given the list [2,2,2,2,3,3,3,3], if I told itertools to collapse every two duplicate items, I would get [2,2,3,3]. – Elmahy Aug 30 '16 at 23:19
  • Is there a reason not to simply use `newlst = [item[0] for item in itertools.groupby(lst)]`? – Jacob Vlijm Aug 31 '16 at 10:03
  • @jacob it is not his desired result – Bharel Aug 31 '16 at 16:20
  • Ah, sorry, missed the "Would you also please..." -section. – Jacob Vlijm Aug 31 '16 at 16:33
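
For the fixed-limit variant raised in the comments above (collapse every n consecutive duplicates into one item), one possible sketch combines itertools.groupby with itertools.islice. The `collapse_runs` name and its `n` parameter are hypothetical, not something from the thread.

```python
import itertools

def collapse_runs(items, n):
    """Keep one item for every chunk of up to n consecutive duplicates."""
    result = []
    for _, group in itertools.groupby(items):
        # Take every n-th item of the run: positions 0, n, 2n, ...
        result.extend(itertools.islice(group, 0, None, n))
    return result

print(collapse_runs([2, 2, 2, 2, 3, 3, 3, 3], 2))  # [2, 2, 3, 3]
```

A run shorter than n still contributes one item, because position 0 is always taken.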
3
list1 = ['a', 'a', 'a', 'b', 'b' , 'a', 'f', 'c', 'a','a']
temp_list = []


for item in list1:
    if len(temp_list) == 0:
        # First item: nothing to compare against yet.
        temp_list.append(item)
    elif temp_list[-1] != item:
        # Item differs from the last one kept, so keep it too.
        temp_list.append(item)

print(temp_list)
  1. Fetch each item from the main list (list1).
  2. If temp_list is empty, add that item.
  3. If not, check whether the last item in temp_list differs from the item fetched from list1.
  4. If the items are different, append the item to temp_list.
Fuji Komalan
2

If you want to use the itertools method @MaxU suggested, a possible code implementation is:

import itertools as it

lst=[1,2,2,4,4,4,4,1,3,3,3,5,5,5,5,5]

unique_lst = [i[0] for i in it.groupby(lst)]

print(unique_lst)
Ben
0

You'd probably want something like this.

lst = [1, 1, 2, 2, 2, 2, 3, 3, 4, 1, 2]
prev_value = None
for number in lst[:]: # the : means we're slicing it, making a copy in other words
    if number == prev_value:
        lst.remove(number)
    else:
        prev_value = number

So, we're going through the list, and if it's the same as the previous number, we remove it from the list, otherwise, we update the previous number.

There may be a more succinct way, but this is the way that looked most apparent to me.

HTH.

Craig Brett
  • Probably better to construct new list, as removing items from a list in a for loop can cause problems – joel goldstick Aug 30 '16 at 21:43
  • We do. We're iterating over a sliced copy of the list, not the original list. So no danger of falling over ourselves with deleting while we're iterating. – Craig Brett Aug 30 '16 at 21:53
  • I missed that.. sorry! – joel goldstick Aug 30 '16 at 21:54
  • No worries - it's bitten me before :) – Craig Brett Aug 30 '16 at 21:57
  • Even when operating on a copy, this won't work correctly if the duplicated value occurs in more than one place in the list. Try it on `[3, 2, 3, 3]`, for example. The issue is that `list.remove(3)` does not remove the duplicated `3` from the end, but rather the lone `3` from the start. `list.remove` is also very slow (each removal requires `O(N)` time). – Blckknght Aug 31 '16 at 00:26
  • Really? Why is that? – Craig Brett Aug 31 '16 at 07:38
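
The behaviour described in the comments can be reproduced with a short sketch. `remove_in_place` mirrors the loop from the answer, and `build_new_list` is one possible alternative that appends to a new list instead; both function names are made up for this illustration.

```python
def remove_in_place(values):
    """The approach from the answer: list.remove() deletes the FIRST match."""
    values = list(values)  # work on a copy so the caller's list is untouched
    prev_value = None
    for number in values[:]:
        if number == prev_value:
            values.remove(number)  # removes the earliest equal item, not this one
        else:
            prev_value = number
    return values

def build_new_list(values):
    """Alternative: append to a new list only when the value changes."""
    result = []
    for number in values:
        if not result or result[-1] != number:
            result.append(number)
    return result

print(remove_in_place([3, 2, 3, 3]))  # [2, 3, 3]
print(build_new_list([3, 2, 3, 3]))   # [3, 2, 3]
```

list.remove(x) searches from the front of the list, so the lone 3 at the start is deleted while the adjacent pair at the end survives.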
0
newlist = []
prev = lst[0]
newlist.append(prev)
for each in lst[1:]:  # skip the first element, lst[0]
    if each != prev:
        newlist.append(each)
    prev = each
jayant singh
  • While this code may answer the question, providing additional context regarding why and/or how this code answers the question improves its long-term value. Code-only answers are discouraged. – Ajean Aug 30 '16 at 22:29
0
st = ['']
[st.append(a) for a in [1,2,2,4,4,4,4,1,3,3,3,5,5,5,5,5] if a != st[-1]]
print(st[1:])
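
One caveat with seeding the list with '' is that an input starting with an empty string would lose its first item. A hedged sketch of the same idea with a unique sentinel object instead (`dedupe_adjacent` is a hypothetical name, not part of the answer):

```python
def dedupe_adjacent(items):
    """Drop consecutive duplicates, using a sentinel no input can equal."""
    last = object()  # unique sentinel: compares unequal to every input item
    result = []
    for item in items:
        if item != last:
            result.append(item)
        last = item
    return result

print(dedupe_adjacent(['', '', 'a', 'a', '']))  # ['', 'a', '']
print(dedupe_adjacent([1, 2, 2, 4, 4, 4, 4, 1, 3, 3, 3, 5, 5, 5, 5, 5]))  # [1, 2, 4, 1, 3, 5]
```

It uses an explicit loop rather than a list comprehension run for its side effects, which avoids building a throwaway list of None values.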
vadim vaduxa
  • While this code may answer the question, providing additional context regarding why and/or how this code answers the question improves its long-term value. Code-only answers are discouraged. – Ajean Aug 30 '16 at 22:25
0

Check whether each element differs from the last item that was kept. If it does, append it.

lst = [1,2,2,4,4,4,4,1,3,3,3,5,5,5,5,5]

new_item = lst[0]
new_list = [lst[0]]
for l in lst:
    # Append only when the value differs from the last one kept.
    if new_item != l:
        new_list.append(l)
        new_item = l

print(new_list)
print(lst)
SuperNova