
I have a DataFrame with a limited number of rows (about 500 at most).

I would like to iterate over the rows and compare a column's value in each row to its value in the next row.

Something like:

for each row in df1
    if df1['col1'] of current row == df1['col1'] of current row+1 then drop row

I appreciate your help. Thanks,

Amir Afianian
    Welcome to Stack Overflow. Please take some time to read [ask], [mcve] and [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – Umar.H Jul 28 '21 at 08:11

1 Answer


I suggest the following:

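index_to_drop = []

# Compare each row's 'col1' with the next row's; stop one row short of the end.
for i in range(len(df) - 1):
    if df['col1'].iloc[i] == df['col1'].iloc[i + 1]:
        index_to_drop.append(i)

# drop() takes index labels, so the positions collected above must match the labels.
df.drop(index_to_drop, inplace=True)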

Warning: I assume that your index is the default integer range (0, 1, 2, ..., n-1). If that is not the case, you can do the following beforehand:

df.reset_index(inplace=True)  # the old index is kept as a new column; pass drop=True to discard it
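
For what it's worth, the same comparison can probably be done without an explicit loop. The following is only a minimal sketch on a made-up sample frame (only the column name `col1` comes from the question): `shift(-1)` lines each value up with the next row's value, and the boolean mask keeps the rows that differ from their successor. Unlike the loop, it should not depend on what the index labels look like.

```python
import pandas as pd

# Hypothetical sample data; only the column name 'col1' comes from the question.
df = pd.DataFrame({"col1": [1, 1, 2, 3, 3, 3, 1], "col2": list("abcdefg")})

# shift(-1) moves every value up one row, so next_val[i] is col1 of row i + 1.
next_val = df["col1"].shift(-1)

# Keep a row only when its value differs from the next row's value.
# The last row has no successor (NaN after the shift), so it is always kept.
result = df[df["col1"] != next_val]
```

As in the loop, the last row of each run of equal values is the one that survives, because every earlier row in the run equals its successor.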

    
Adrien
  • Works like a charm, thank you very much. What if I wanted to change the logic slightly: for the current row, check the value of df['col1'] in all rows and drop whatever matches, not just the next row? Should I add an inner loop like: for j in range(data.shape[0]-1)? Thanks – DevBeginner Jul 28 '21 at 08:32
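
On the follow-up: an inner loop is likely unnecessary. The sketch below assumes one reading of the comment, namely that the first row with a given `col1` value is kept and every later row repeating that value is dropped. The sample frame is made up, and `drop_duplicates` is swapped in for the nested loop.

```python
import pandas as pd

# Hypothetical sample data; only the column name 'col1' comes from the thread.
df = pd.DataFrame({"col1": [1, 2, 1, 3, 2], "col2": list("abcde")})

# Keep the first row for each distinct col1 value and drop later repeats.
deduped = df.drop_duplicates(subset="col1", keep="first")

# If every row whose value occurs more than once should go instead, use keep=False.
unique_only = df.drop_duplicates(subset="col1", keep=False)
```

Which of the two variants fits depends on whether the "current row" itself is meant to survive.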