This is my dataframe:

import pandas as pd

df = pd.DataFrame({'a': [1, 0, 1, 0, 1], 'b':[0, 0, 0, 1 ,0]})

I want to select the rows of df where the value in a is greater than the value in b in the previous row, so I used shift(1):

df_shift = df.loc[df.a > df.b.shift(1)]

This gives me only row 2, but I want to always keep the first row, since there is no previous row to compare row 0 against.
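To see why row 0 drops out, here is a minimal sketch of the intermediate values. The names prev_b and mask are only illustrative and are not part of the code above:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 0, 1, 0, 1], 'b': [0, 0, 0, 1, 0]})

prev_b = df.b.shift(1)   # row 0 has no predecessor, so it becomes NaN
mask = df.a > prev_b     # a comparison against NaN evaluates to False

print(prev_b.tolist())   # [nan, 0.0, 0.0, 0.0, 1.0]
print(mask.tolist())     # [False, False, True, False, False]
```

So the first row is never selected by the comparison alone, whatever value a holds there.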

This is the result that I want:

   a  b
0  1  0
2  1  0

I have read these two posts: post_1, post_2. I have also tried the following code, which gives me the result that I want, but it relies on concat to add the first row back.

df_concat = pd.concat([df.iloc[[0]], df_shift], axis=0)

Is there another way to do it?

Amir
  • added a solution. does it work for you? – Naveed Jul 28 '22 at 16:44
  • @Naveed it works for this sample. Let me check it with my original dataframe and I will inform you within a couple hours. For now a big thumbs up for your ty :) – Amir Jul 28 '22 at 16:49

1 Answer

Here is one way to do it: also select a row when the previous b value is null, which is the case for the first row because shift(1) has nothing to fill it with.

df.loc[(df.a > df.b.shift(1)) | df.b.shift(1).isnull()]

   a  b
0  1  0
2  1  0
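A possible variation, sketched here under the assumption that only the first row should be kept unconditionally: build the mask once and set its first element by position. The names mask and result are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 0, 1, 0, 1], 'b': [0, 0, 0, 1, 0]})

mask = df.a > df.b.shift(1)   # False for row 0, because shift(1) yields NaN there
mask.iloc[0] = True           # force-keep the first row by position
result = df.loc[mask]

print(result)
```

The two approaches differ if b itself contains missing values: the isnull() check also keeps any row whose previous b is NaN, while this version only forces the first row.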
Naveed