Removing elements that have consecutive duplicates

Question

I was curious about the question "Eliminate consecutive duplicates of list elements" and how it should be implemented in Python.

What I came up with is this:

lst = [1,1,1,1,1,1,2,3,4,4,5,1,2]  # named lst to avoid shadowing the built-in list
i = 0

while i < len(lst)-1:
    if lst[i] == lst[i+1]:
        del lst[i]
    else:
        i = i+1

print(lst)

Output:

[1, 2, 3, 4, 5, 1, 2]

Which I guess is ok.

So I got curious, and wanted to see if I could delete the elements that had consecutive duplicates and get this output:

[2, 3, 5, 1, 2]

For that I did this:

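lst = [1,1,1,1,1,1,2,3,4,4,5,1,2]
i = 0
dupe = False  # True while lst[i] is the last survivor of a run of duplicates

while i < len(lst)-1:
    if lst[i] == lst[i+1]:
        del lst[i]
        dupe = True
    elif dupe:
        del lst[i]
        dupe = False
    else:
        i += 1

if dupe:  # the list ended with a run of duplicates
    del lst[i]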

But it seems sort of clumsy and not Pythonic. Is there a smarter / more elegant / more efficient way to implement this?

For very long lists consider using NumPy: [Remove following duplicates in a numpy array](https://stackoverflow.com/questions/37839928/remove-following-duplicates-in-a-numpy-array) — Georgy, Jul 18 '19 at 12:55

score 100 · Accepted Answer · edited Feb 23 '22 at 00:13

>>> L = [1,1,1,1,1,1,2,3,4,4,5,1,2]
>>> from itertools import groupby
>>> [key for key, _group in groupby(L)]
[1, 2, 3, 4, 5, 1, 2]

For the second part:

>>> [k for k, g in groupby(L) if len(list(g)) < 2]
[2, 3, 5, 1, 2]

If you don't want to create the temporary list just to take the length, you can use sum over a generator expression:

>>> [k for k, g in groupby(L) if sum(1 for i in g) < 2]
[2, 3, 5, 1, 2]
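
For reuse, the comprehensions above can be wrapped in small helper functions. This is a minimal sketch; the names `dedupe_consecutive` and `drop_repeated_runs` are made up for illustration and are not part of itertools.

```python
from itertools import groupby


def dedupe_consecutive(iterable):
    """Collapse each run of equal neighbours to a single element."""
    return [key for key, _group in groupby(iterable)]


def drop_repeated_runs(iterable):
    """Keep only the elements whose run has length 1."""
    return [key for key, group in groupby(iterable) if sum(1 for _ in group) < 2]


L = [1, 1, 1, 1, 1, 1, 2, 3, 4, 4, 5, 1, 2]
print(dedupe_consecutive(L))   # [1, 2, 3, 4, 5, 1, 2]
print(drop_repeated_runs(L))   # [2, 3, 5, 1, 2]
```

Because groupby accepts any iterable, the same helpers should also work on strings, tuples, or generators.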

edited Feb 23 '22 at 00:13

wjandrea

answered Apr 21 '11 at 02:45

John La Rooy

Note that the [docs](https://docs.python.org/3/library/itertools.html#itertools.groupby) state that "the iterable needs to already be sorted", which is not the case for the input data here. Still, the proposed solution worked perfectly for the cases I've tested. – normanius Dec 29 '19 at 23:56

@normanius, to remove all duplicates, the input would need to be sorted. This question is only about removing consecutive duplicates, so you would not want to sort in this case. – John La Rooy Jan 03 '20 at 00:58

`sum(1 for i in g) < 2` is still somewhat wasteful, looking at every group element and summing potentially many 1s. `all(0 for _ in g for _ in g)` only looks at up to two group elements. Plus I think it's more interesting :-) – Kelly Bundy Dec 19 '21 at 21:29

@Kelly Another idea, more explicit than the nested one: `len(tuple(islice(g, 2))) < 2`. IDK if it would perform differently though. – wjandrea Feb 23 '22 at 00:21
@wjandrea Seems to be slightly faster for [short groups](https://tio.run/##lVHbasMwDH33VwhGiT3MyGWF0a1/0D8oYaRFybw6drCV0X59ZjtNu8fOepF0jo5kabjQlzXV2@CmqXW2B1I9KgLVD9YROBywIcY80jjAFrIsY4mmCB1Zq/3C7Jwdh8NFgvJaHZHtAntfyLuVspKvwdbRr@EZijzPg7AhZVAHtj1845G4YLEJQx8FGISX7U/QWgcnCR0os7TiOwGqBT/2vEi4SqCADyjrTD5Q2mjN84R/JvCPKx5T0Gh42IxGPn@bdxJKIf4xg8Ez8TC08mAsXUMJy14SsARRsWbsNqVrTIe8EpvUJ6YxptHPifgobLFXhs@H5BiVw8ASzNgf0G0Leb3xtlwLcSsbnDLEs9VL1ULvIYMVcIo3w0pIwDvxCfCMR55Exfu1Dn8azVHsN2Vez9Q5L6bpFw) and ... – Kelly Bundy Feb 23 '22 at 06:47
... equally fast for [long groups](https://tio.run/##lVFZbsMgEP3nFCNVEVChyksjVWlzg9zAsionGrs0GCzAlXN6F7CzfLbwA/OWeZoZLv7L6PJtsPPcWtODlz1KD7IfjPVgccDGE@LQjwPsgVJKEk16tN4Y5a7MzppxOF4ESKfkCckhsKsJWmNhAqmhysX9FqIUr@Fu47t@IE01PEOZZVkdWmovNargY47fePKMk9ieoIvWBMKh1TmJzwK6qF9DsAMH2YIbe5YnXCaQwwcUNRV/kDZKsSzhnwl8ePK/OSjULMxMIVsGwjoBBef/yKBx8iyElg608etXwHUuCbh@omNNyC2lbXSHrOS71CeWMZbRLYV4fJhiLzVbVswwOofAAvTYH9Huc7Fuf19sOb/JBiu1Z3TzUrbQO6CwAebD0nIsuQC8E58AJzyxZMrfVx3@NIohr3ZFVi/Upc7n@Rc). – Kelly Bundy Feb 23 '22 at 06:47
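
The `islice` idea from the comments above can be sketched as follows. The helper name `is_single` is hypothetical; the length check itself reads at most two items from each group, although groupby still has to step past the rest of the run to reach the next group.

```python
from itertools import groupby, islice


def is_single(group):
    """Return True if the group iterator yields fewer than two items."""
    return len(tuple(islice(group, 2))) < 2


L = [1, 1, 1, 1, 1, 1, 2, 3, 4, 4, 5, 1, 2]
print([k for k, g in groupby(L) if is_single(g)])  # [2, 3, 5, 1, 2]
```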

score 31 · Answer 2 · edited Feb 05 '18 at 14:13

One-liner in pure Python:

[v for i, v in enumerate(your_list) if i == 0 or v != your_list[i-1]]
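
A quick check of the one-liner, wrapped in a function for convenience. This is only a usage sketch: the name `dedupe` is invented here, and the argument must be an indexable sequence (a list or tuple, say) because the expression looks up `seq[i - 1]`.

```python
def dedupe(seq):
    # Keep the first element, then every element that differs from its predecessor.
    return [v for i, v in enumerate(seq) if i == 0 or v != seq[i - 1]]


print(dedupe([1, 1, 1, 1, 1, 1, 2, 3, 4, 4, 5, 1, 2]))  # [1, 2, 3, 4, 5, 1, 2]
print(dedupe([]))                                        # []
```

The `i == 0` test matters: without it, `seq[-1]` would compare the first element with the last one.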

edited Feb 05 '18 at 14:13

answered Oct 27 '17 at 14:18

Ulf Aslak

score 14 · Answer 3 · answered Jul 13 '20 at 19:26

If you use Python 3.8+, you can use an assignment expression (the := operator):

list1 = [1, 2, 3, 3, 4, 3, 5, 5]

prev = object()
list1 = [prev:=v for v in list1 if prev!=v]

print(list1)

Prints:

[1, 2, 3, 4, 3, 5]
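
The `prev = object()` line acts as a sentinel: a fresh `object()` compares unequal to any ordinary value, so the first element always passes the `prev != v` test. Below is a sketch of the same idea as a function, assuming Python 3.8+; the name `dedupe_walrus` is invented here.

```python
def dedupe_walrus(items):
    prev = object()  # sentinel: not equal to any ordinary element
    # The filter runs first against the old prev; `prev := v` then records
    # the element that was kept.
    return [prev := v for v in items if v != prev]


print(dedupe_walrus([1, 2, 3, 3, 4, 3, 5, 5]))  # [1, 2, 3, 4, 3, 5]
print(dedupe_walrus([None, None, 0]))            # [None, 0]
```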

answered Jul 13 '20 at 19:26

Andrej Kesely

score 4 · Answer 4 · answered Jul 13 '20 at 19:26

A "lazy" approach would be to use itertools.groupby.

import itertools

list1 = [1, 2, 3, 3, 4, 3, 5, 5]
list1 = [g for g, _ in itertools.groupby(list1)]
print(list1)

outputs

[1, 2, 3, 4, 3, 5]

answered Jul 13 '20 at 19:26

DeepSpace

score 3 · Answer 5 · answered Jul 13 '20 at 19:26

You can do this by using zip_longest() with a list comprehension.

from itertools import zip_longest

list1 = [1, 2, 3, 3, 4, 3, 5, 5]

# Pair each element with its successor; zip_longest pads the final pair
# with None, so the last element is compared against None.
res = [i for i, j in zip_longest(list1, list1[1:]) if i != j]
print("List after removing consecutive duplicates: " + str(res))
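
One caveat with this approach: `zip_longest` pads the final pair with `None` by default, so a list whose last element is `None` would lose it. Here is a hedged sketch that avoids this by passing a private sentinel as `fillvalue`; the names `_MISSING` and `dedupe_zip` are made up for illustration.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel that cannot collide with real data


def dedupe_zip(seq):
    # Pair each element with its successor; the last element is paired
    # with the sentinel, so it is always kept.
    pairs = zip_longest(seq, seq[1:], fillvalue=_MISSING)
    return [cur for cur, nxt in pairs if cur != nxt]


print(dedupe_zip([1, 2, 3, 3, 4, 3, 5, 5]))  # [1, 2, 3, 4, 3, 5]
print(dedupe_zip([1, None, None]))           # [1, None]
```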

score 2 · Answer 6 · answered Apr 18 '19 at 22:59

Here is a solution without dependence on outside packages:

lst = [1,1,1,1,1,1,2,3,4,4,5,1,2]
L = lst + [object()]  # append a unique dummy element so that L[-1] (compared when i == 0) never matches
result = [l for i, l in enumerate(L) if l != L[i - 1]][:-1]  # drop the dummy element

Then I noticed that Ulf Aslak's similar solution is cleaner :)

score 1 · Answer 7 · edited Jul 29 '20 at 12:37

As an alternative for eliminating consecutive duplicates of list elements, you can use itertools.zip_longest() with a list comprehension:

>>> from itertools import zip_longest

>>> my_list = [1,1,1,1,1,1,2,3,4,4,5,1,2]
>>> [i for i, j in zip_longest(my_list, my_list[1:]) if i!=j]
[1, 2, 3, 4, 5, 1, 2]

edited Jul 29 '20 at 12:37

Georgy

answered Dec 13 '16 at 20:43

Moinuddin Quadri

score 1 · Answer 8 · edited Jul 29 '20 at 12:42

Plenty of better/more Pythonic answers above; however, one could also accomplish this task using list.pop():

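my_list = [1, 2, 3, 3, 4, 3, 5, 5]
i = 0
while i < len(my_list) - 1:
    # Compare by position; list.index(x) would find the first occurrence
    # of x, which is the wrong one when a value reappears later in the list.
    if my_list[i + 1] == my_list[i]:
        my_list.pop(i + 1)  # drop the repeated neighbour, then re-check position i
    else:
        i += 1
print(my_list)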

outputs

[1, 2, 3, 4, 3, 5]

edited Jul 29 '20 at 12:42

Georgy

answered Jul 13 '20 at 19:34

plum 0

score 0 · Answer 9 · answered Jan 05 '20 at 12:21

Another possible one-liner, using functools.reduce (excluding the import), with the downside that strings and lists require slightly different implementations:

>>> from functools import reduce

>>> reduce(lambda a, b: a if a[-1:] == [b] else a + [b], [1,1,2,3,4,4,5,1,2], [])
[1, 2, 3, 4, 5, 1, 2]

>>> reduce(lambda a, b: a if a[-1:] == b else a+b, 'aa  bbb cc')
'a b c'
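
If keeping two variants is inconvenient, one way to sidestep the difference is to always accumulate into a list and join the result afterwards for strings. This is a sketch rather than a replacement for the one-liners above; the helper name `dedupe_reduce` is hypothetical.

```python
from functools import reduce


def dedupe_reduce(iterable):
    # Accumulate into a list regardless of input type; a[-1:] is [] for an
    # empty accumulator, so the first item is always appended.
    return reduce(lambda a, b: a if a[-1:] == [b] else a + [b], iterable, [])


print(dedupe_reduce([1, 1, 2, 3, 4, 4, 5, 1, 2]))  # [1, 2, 3, 4, 5, 1, 2]
print(''.join(dedupe_reduce('aa  bbb cc')))         # a b c
```

Note that `a + [b]` copies the accumulator on every step, so this is quadratic in the worst case and best kept for short inputs.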
