
I have the following dataframe, from which I need to drop consecutive duplicate values only if they equal 0.3 or 0.4.

In [2]: df = pd.DataFrame(index=pd.date_range('20020101', periods=7, freq='D'),
                          data={'poll_support': [0.3, 0.4, 0.4, 0.4, 0.3, 0.5, 0.5]})
    
In [3]: df
Out[3]:
                poll_support
2002-01-01           0.3
2002-01-02           0.4
2002-01-03           0.4
2002-01-04           0.4
2002-01-05           0.3
2002-01-06           0.5
2002-01-07           0.5

I need the df to look like this:

2002-01-01           0.3
2002-01-02           0.4
2002-01-05           0.3
2002-01-06           0.5
2002-01-07           0.5

I tried:

for var in df['poll_support']:
    if var == 0.3 or var == 0.4:
        df['poll_support']= df['poll_support'].loc[df['poll_support'].shift() != 0.3]
        df['poll_support']= df['poll_support'].loc[df['poll_support'].shift() != 0.4]

However, this does not produce the desired df.

I would love to hear suggestions.

arkadiy
  • Does this answer your question? [Pandas: Drop consecutive duplicates](https://stackoverflow.com/questions/19463985/pandas-drop-consecutive-duplicates) – TheFaultInOurStars Feb 27 '22 at 21:04
  • @AmirhosseinKiani No, that solution does not account for duplicates of a specific type- it tackles all consecutive duplicates – arkadiy Feb 27 '22 at 21:06

1 Answer

Boolean indexing will help. Try:

df[~((df['poll_support'] == df['poll_support'].shift()) & (df['poll_support'].isin([0.3, 0.4])))]

             poll_support
2002-01-01           0.3
2002-01-02           0.4
2002-01-05           0.3
2002-01-06           0.5
2002-01-07           0.5
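For completeness, a self-contained sketch of the same approach, split into named masks (the column name and the 0.3/0.4 values come from the question):

```python
import pandas as pd

df = pd.DataFrame(
    index=pd.date_range('20020101', periods=7, freq='D'),
    data={'poll_support': [0.3, 0.4, 0.4, 0.4, 0.3, 0.5, 0.5]},
)

# A row is dropped only when it repeats the immediately preceding value
# AND that value is one of the targeted ones (0.3 or 0.4).
is_repeat = df['poll_support'] == df['poll_support'].shift()
is_target = df['poll_support'].isin([0.3, 0.4])
result = df[~(is_repeat & is_target)]
print(result)
```

Note that `shift()` leaves a `NaN` in the first row, and `NaN == anything` is `False`, so the first row is always kept.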
wwnde