I am trying to remove consecutive duplicates from sequences stored in a dataframe. I have approximately 3,000 rows, each holding a sequence. For example, A,B,B,A,D,D,E should become A,B,A,D,E. The order of the sequence must stay the same; only the consecutive duplicates should be removed.

I have tried zip_longest and the itertools groupby function.

My problem is the number of rows. How would I write a for loop over this dataframe so that these functions are applied to each 'sequence'?

Thanks so much for the help!

  • Does this question help? [Pandas: Drop consecutive duplicates](https://stackoverflow.com/questions/19463985/pandas-drop-consecutive-duplicates) – aaossa Feb 18 '22 at 20:53

1 Answer

You could try this:

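import pandas as pd

df = pd.DataFrame(['A', 'B', 'B', 'A', 'D', 'D', 'E'])

# Collect the labels of rows that repeat the previous row's value,
# then drop them in one call rather than mutating df while iterating.
previous = None
to_drop = []
for index, row in df.iterrows():
    current = row[0]  # column label 0, the only column in this frame
    if current == previous:
        to_drop.append(index)
    previous = current

df = df.drop(to_drop)
print(df)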
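
As an alternative to looping with iterrows, here is a minimal sketch of two other ways to do it. It assumes two layouts that the question does not confirm: a column (hypothetically named `sequence`) holding one comma-separated string per row, and the single-column layout from the example above. The helper name `drop_consecutive_duplicates` is made up for illustration.

```python
from itertools import groupby

import pandas as pd


def drop_consecutive_duplicates(sequence: str, sep: str = ",") -> str:
    """Collapse runs of equal neighbouring items, keeping the original order."""
    items = sequence.split(sep)
    # groupby yields one (key, group) pair per run of equal adjacent items,
    # so taking only the keys keeps one item per run.
    return sep.join(key for key, _ in groupby(items))


# Assumed layout 1: one comma-separated sequence per row (hypothetical column name)
df = pd.DataFrame({"sequence": ["A,B,B,A,D,D,E", "X,X,X,Y", "Q"]})
df["deduped"] = df["sequence"].apply(drop_consecutive_duplicates)
print(df)

# Assumed layout 2: one item per row, as in the example above.
# Keep a row only when its value differs from the row before it.
tall = pd.DataFrame(["A", "B", "B", "A", "D", "D", "E"])
tall_deduped = tall[tall[0] != tall[0].shift()]
print(tall_deduped)
```

With `apply`, no explicit for loop over the rows is needed: pandas calls the helper once per row. If the sequences are stored as lists rather than strings, the `split` and `join` steps would have to be dropped.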
Manjunath K Mayya