I used the following function to prepare a DataFrame for analysis.

def DAYS(DF_FRAME_NAME):
    if (DF_FRAME_NAME['Column1'] == 'Yes' and (DF_FRAME_NAME['DAYS'] == 1)):
        return 3
    elif (DF_FRAME_NAME['Column1'] == 'Yes' and (DF_FRAME_NAME['DAYS'] == 2)):
        return 3
    if (DF_FRAME_NAME['Column2'] == 'YES' and (DF_FRAME_NAME['DAYS'] == 1)):
        return 3
    elif (DF_FRAME_NAME['Column2'] == 'YES' and (DF_FRAME_NAME['DAYS'] == 2)):
        return 3
    elif (DF_FRAME_NAME['Column2'] == 'YES' and (DF_FRAME_NAME['DAYS'] == 3)):
        return 5
    elif (DF_FRAME_NAME['Column2'] == 'YES' and (DF_FRAME_NAME['DAYS'] == 4)):
        return 5
    elif (DF_FRAME_NAME['Column2'] == 'YES' and (DF_FRAME_NAME['DAYS'] == 5)):
        return 5
    elif (DF_FRAME_NAME['Column1'] != 'Yes'):
        return DF_FRAME_NAME['DAYS']
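    elif (DF_FRAME_NAME['Column2'] != 'YES'):
        return DF_FRAME_NAME['DAYS']
    # Fallback: no rule matched, so keep the value instead of implicitly returning None.
    return DF_FRAME_NAME['DAYS']

df['DAYS'] = df.apply(DAYS, axis=1)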

The function above is slow, so with the help of @deadshot I rewrote the code as follows:

DF['DAYS'] = np.where(DF['column1'].eq('Yes') & DF['DAYS'].eq(1), 3, DF['DAYS'])
DF['DAYS'] = np.where(DF['column1'].eq('Yes') & DF['DAYS'].eq(2), 3, DF['DAYS'])
DF['DAYS'] = np.where(DF['column2'].eq('YES') & DF['DAYS'].eq(1), 3, DF['DAYS'])
DF['DAYS'] = np.where(DF['column2'].eq('YES') & DF['DAYS'].eq(2), 3, DF['DAYS'])
DF['DAYS'] = np.where(DF['column2'].eq('YES') & DF['DAYS'].eq(3), 5, DF['DAYS'])
DF['DAYS'] = np.where(DF['column2'].eq('YES') & DF['DAYS'].eq(4), 5, DF['DAYS'])
DF['DAYS'] = np.where(DF['column2'].eq('YES') & DF['DAYS'].eq(5), 5, DF['DAYS'])
DF['DAYS'] = np.where(DF['column1'].eq('No'), DF['DAYS'], 1)
DF['DAYS'] = np.where(DF['column2'].eq('No'), DF['DAYS'], 1)

Performance-wise, the code above is much faster when applied to a DataFrame with over 200,000 rows. But is it the correct method? Can I use a custom function inside np.where? Please help me write this more cleanly, since many conditions apply to the same column.
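One possible direction, offered as a sketch rather than a definitive answer: np.select evaluates every condition against the original column in a single pass and takes the first match, which is closer to the if/elif function than a chain of np.where calls. In the chained version each call reads the DAYS values already changed by the previous call, so a row with column2 equal to 'YES' and DAYS equal to 1 becomes 3 and is then turned into 5 by the later eq(3) step. The last two calls also write 1 into every row whose column is not 'No', which the row-wise function never does. The helper name adjust_days and the sample data below are made up for illustration, and the column names assume the 'Column1'/'Column2' spelling from the function.

```python
import numpy as np
import pandas as pd


def adjust_days(df: pd.DataFrame) -> pd.Series:
    """Return the adjusted DAYS column; the input frame is not modified."""
    days = df["DAYS"]
    col1_yes = df["Column1"].eq("Yes")
    col2_yes = df["Column2"].eq("YES")

    # Conditions are checked top to bottom and the first match wins,
    # which mirrors the if/elif order of the row-wise function.
    conditions = [
        col1_yes & days.isin([1, 2]),
        col2_yes & days.isin([1, 2]),
        col2_yes & days.isin([3, 4, 5]),
    ]
    choices = [3, 3, 5]

    # Rows that match no condition keep their original DAYS value.
    result = np.select(conditions, choices, default=days)
    return pd.Series(result, index=df.index, name="DAYS")


# Hypothetical sample data, only to illustrate the behaviour.
sample = pd.DataFrame(
    {
        "Column1": ["Yes", "No", "No", "Yes", "No"],
        "Column2": ["No", "YES", "YES", "YES", "No"],
        "DAYS": [1, 2, 4, 7, 2],
    }
)
sample["DAYS"] = adjust_days(sample)
print(sample["DAYS"].tolist())  # [3, 3, 5, 7, 2]
```

On the custom-function question: np.where and np.select only need boolean arrays, so a custom function should work as long as it takes whole columns and returns a boolean Series or array, rather than being called once per row.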

Philip Kendall