1

I am trying to replace the values of 3 columns within multiple observations based on two conditionals ( e.g., specific ID after a particular date).

I have seen similar questions.

  1. Pandas Multiple Conditions Function based on Column

  2. Pandas replace, multi column criteria

  3. Pandas: How do I assign values based on multiple conditions for existing columns?

  4. Replacing values in a pandas dataframe based on multiple conditions

However, they did not quite address my problem or I can't quite manipulate them to solve my problem.

This code will generate a dataframe similar to mine:

df = pd.DataFrame({'SUR_ID': {0:'SUR1', 1:'SUR1', 2:'SUR1', 3:'SUR1', 4:'SUR2', 5:'SUR2'}, 'DATE': {0:'05-01-2019', 1:'05-11-2019', 2:'06-15-2019', 3:'06-20-2019', 4: '05-15-2019', 5:'06-20-2019'}, 'ACTIVE_DATE': {0:'05-01-2019', 1:'05-01-2019', 2:'05-01-2019', 3:'05-01-2019', 4: '05-01-2019', 5:'05-01-2019'}, 'UTM_X': {0:'444895', 1:'444895', 2:'444895', 3:'444895', 4: '445050', 5:'445050'}, 'UTM_Y': {0:'4077528', 1:'4077528', 2:'4077528', 3:'4077528', 4: '4077762', 5:'4077762'}})

Output Dataframe:

enter image description here

What I am trying to do:

I am trying to replace UTM_X,UTM_Y, AND ACTIVE_DATE with

[444917, 4077830, '06-04-2019']

when

SUR_ID is "SUR1" and DATE >= "2019-06-04 12:00:00"

This is a poorly adapted version of the solution for question 1 in attempts to fix my problem- throws error:

df.loc[[df['SUR_ID'] == 'SUR1' and df['DATE'] >='2019-06-04 12:00:00'], ['UTM_X', 'UTM_Y', 'Active_Date']] = [444917, 4077830, '06-04-2019']
Devon Oliver
  • 295
  • 1
  • 7
  • 20

2 Answers2

1

First ensure that the column Date is of type datetime, and then when using 2 conditions, they need to be between parenthesis individually. so you can do:

df.DATE = pd.to_datetime(df.DATE)
df.loc[ (df['SUR_ID'] == 'SUR1') & (df['DATE'] >= pd.to_datetime('2019-06-04 12:00:00')), 
        ['UTM_X', 'UTM_Y', 'ACTIVE_DATE']] = [444917, 4077830, '06-04-2019']

See the difference between what you wrote for the boolean mask:

[df['SUR_ID'] == 'SUR1' and df['DATE'] >='2019-06-04 12:00:00'] 

and what is here with parenthesis

(df['SUR_ID'] == 'SUR1') & (df['DATE'] >= pd.to_datetime('2019-06-04 12:00:00'))
Ben.T
  • 29,160
  • 6
  • 32
  • 54
  • Thank you for the clarification. Nice to know I wasn't too far of base. – Devon Oliver Aug 28 '19 at 20:30
  • I had date set to date time in the dataframe, but I did not set it to datetime in the conditional. – Devon Oliver Aug 28 '19 at 20:37
  • @DevonOliver indeed not too far :) and if your column DATE is already of datetime, then converting the string to datetime in the conditional may not be necessary, but it is more rigorous – Ben.T Aug 28 '19 at 20:43
1

Use:

df['UTM_X']=df['UTM_X'].mask(df['SUR_ID'].eq('SUR1') & (pd.to_datetime(df['DATE'])>= pd.to_datetime("2019-06-04 12:00:00")),444917)
df['UTM_Y']=df['UTM_Y'].mask(df['SUR_ID'].eq('SUR1') & (pd.to_datetime(df['DATE'])>= pd.to_datetime("2019-06-04 12:00:00")),4077830)
df['ACTIVE_DATE']=df['ACTIVE_DATE'].mask(df['SUR_ID'].eq('SUR1') & (pd.to_datetime(df['DATE'])>= pd.to_datetime("2019-06-04 12:00:00")),'06-04-2019')

Output:

    SUR_ID  DATE        ACTIVE_DATE UTM_X   UTM_Y
0   SUR1    05-01-2019  05-01-2019  444895  4077528
1   SUR1    05-11-2019  05-01-2019  444895  4077528
2   SUR1    06-15-2019  06-04-2019  444917  4077830
3   SUR1    06-20-2019  06-04-2019  444917  4077830
4   SUR2    05-15-2019  05-01-2019  445050  4077762
5   SUR2    06-20-2019  05-01-2019  445050  4077762
ansev
  • 30,322
  • 5
  • 17
  • 31