How can you remove consecutive duplicates of a specific value?

I am aware of itertools.groupby(), but using it the usual way removes consecutive duplicates of every value.

See the example code below. The specific value is 2, and I want to remove only its consecutive duplicates.

import pandas as pd
from itertools import groupby

example = [1,1,5,2,2,2,7,9,9,2,2]
Col1 = pd.DataFrame(example)
# This collapses consecutive duplicates of every value, not just a specific number
res = [i[0] for i in groupby(example)]

The desired result would contain [1,1,5,2,7,9,9,2]

Salvatore
  • Does this answer your question? [Removing elements that have consecutive duplicates](https://stackoverflow.com/questions/5738901/removing-elements-that-have-consecutive-duplicates) – Carlo Zanocco Aug 04 '20 at 15:02
  • Thanks for the response. I have viewed that pre-existing question, and it deletes all consecutive duplicates. I am looking to delete consecutive duplicates of a specific value only. – DaddiLopez11 Aug 04 '20 at 15:05

2 Answers

Doing this with pandas seems overkill unless you are already using pandas for other purposes. With plain itertools, for example:

In []:
import itertools as it
example = [1,1,5,2,2,2,7,9,9,2,2]
[x for k, g in it.groupby(example) for x in ([k] if k == 2 else g)]

Out[]:
[1, 1, 5, 2, 7, 9, 9, 2]
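
The comprehension works because groupby yields one (key, group) pair per run of equal consecutive items: a run of the target value is replaced by the single key, and every other run is passed through unchanged. The same idea can be sketched as a reusable function. The name collapse_runs and its parameters are hypothetical, chosen here for illustration only:

```python
from itertools import groupby

def collapse_runs(values, target):
    """Collapse each consecutive run of `target` to one item; keep other runs as-is.

    Hypothetical helper for illustration, not part of itertools or pandas.
    """
    result = []
    for key, group in groupby(values):
        if key == target:
            result.append(key)    # keep a single representative of the run
        else:
            result.extend(group)  # keep every item of a non-target run
    return result

example = [1, 1, 5, 2, 2, 2, 7, 9, 9, 2, 2]
print(collapse_runs(example, 2))  # [1, 1, 5, 2, 7, 9, 9, 2]
```

Since groupby compares items with ==, this should also work for strings or other comparable values, not only numbers.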
AChampion

Try using the column's diff(): a diff of 0 means the row repeats the value of the row before it.

In your case, where we only care about deduplication when the value of the column is 2, we condition on the diff being nonzero or the column being not equal to 2:

import pandas as pd

example = [1,1,5,2,2,2,7,9,9,2,2]

df = pd.DataFrame(dict(a=example))
df.loc[(df.a.diff() != 0) | (df.a != 2)]
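
Note that diff() relies on subtraction, so it only suits numeric columns. As a hedged variation on the same mask, comparing each row with shift() avoids the subtraction and may carry over to string columns as well. The names target, is_repeat, keep and result below are illustrative, not from the original answer:

```python
import pandas as pd

example = [1, 1, 5, 2, 2, 2, 7, 9, 9, 2, 2]
df = pd.DataFrame({"a": example})

target = 2  # the value whose consecutive duplicates should be dropped

# True where a row equals the row directly above it (the first row is never a repeat)
is_repeat = df["a"].eq(df["a"].shift())

# Drop only rows that are both a repeat and equal to the target value
keep = ~(is_repeat & df["a"].eq(target))

result = df.loc[keep].reset_index(drop=True)
print(result["a"].tolist())  # [1, 1, 5, 2, 7, 9, 9, 2]
```

The reset_index(drop=True) call is optional; it just renumbers the surviving rows from 0.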
Dex Groves