I am new to python but have been in the programming world for a while. I have already tried to do the following with if else loop using python dataframes and iloc
and was successful. I want to use vectorization. The problem is that once a data changes its state based upon rule 1 (b < a) I want it to remain in that state as long as rule 2 (c < a) applies.
So the input and output will look like the following:
I have tried numpy select and numpy where with shift but the compare happens upon the same column so I couldn't get it to work. It works the first time when the rule 2 applies not after that. since above is the sample problem I created from the one I have so that code is hard to provide
The Flag is what I expect but i don't get it
df is with loops and df1 is with numpy - I can't get it to look same
import pandas as pd
import numpy as np
df = pd.DataFrame({"a": [20,25,30,32,25,26,30,24,25,30,30,35,20,22,25,28,29,31,25,21], "b": [26,28,26,35,34,34,31,26,28,29,31,32,18,24,30,35,32,35,36,30], "c":[18,19,20,22,23,24,34,22,23,31,32,38,22,21,22,21,18,22,28,31]})
df1 = df
position=''
Flag = False
df = df.assign(flag=Flag)
for id1, id2 in zip(df.iterrows(), df.iloc[1:].iterrows()):
if id1[1]['b'] < id1[1]['a'] :
Flag=True
position='rule1True'
elif ( id1[1]['c'] < id1[1]['a'] and (position =='rule1True' or position == 'rule2True') ):
Flag = True
position = 'rule2True'
else :
Flag = False
position = ''
df.at[id1[0],'flag'] = Flag
print(df)
df1['rule1'] = np.select([df1['b'] < df1['a']],[True], default= False)
df1['rule2'] = np.select([( df1['rule1'].shift(1) & (df1['c'] < df1['a']))],[True], default= False)
df1['flag'] = np.select([( df1['rule1'] | df1['rule2'])],[True], default= False)
print(df1)