0

I am trying to create a new columns based on two other column values with one column being datetime. Below is an example of the data. I would like to get the result shown in the second table.

Site Name Date
Westwood 2022-11-15
Westwood 2022-11-16
Northend 2021-08-04
Northend 2021-08-05
Northend 2021-08-06

I'm hoping to get the below result.

Site Name Date Active_Ind
Westwood 2022-11-15 0
Westwood 2022-11-16 0
Northend 2021-08-06 0
Northend 2021-08-05 1
Northend 2021-08-04 1

Here is the what I have tried so far but I keep getting the error '<' not supported between instances of 'Timestamp' and 'int'

def conditions(df):
    if (df['Site Name']=='Northend') & (df['Date'] < 2021-08-06):
        return 1
    else:
        return 0

df['Active_Ind']=df.apply(conditions,axis=1)

1 Answers1

0

Example

data = {'Site Name': {0: 'Westwood', 1: 'Westwood', 2: 'Northend', 3: 'Northend', 4: 'Northend'},
        'Date': {0: '2022-11-15', 1: '2022-11-16', 2: '2021-08-04', 3: '2021-08-05', 4: '2021-08-06'}}
df = pd.DataFrame(data)

df

    Site Name   Date
0   Westwood    2022-11-15
1   Westwood    2022-11-16
2   Northend    2021-08-04
3   Northend    2021-08-05
4   Northend    2021-08-06

Code

ordering of 1st table and 2nd table is different. I'll think of it as typo

df['Date'] = pd.to_datetime(df['Date'])
s = df['Site Name'].eq('Northend') & df['Date'].lt(pd.Timestamp('2021-08-06'))
df['Active_ind'] = s.astype('int')

df

    Site Name   Date        Active_ind
0   Westwood    2022-11-15  0
1   Westwood    2022-11-16  0
2   Northend    2021-08-04  1
3   Northend    2021-08-05  1
4   Northend    2021-08-06  0
Panda Kim
  • 6,246
  • 2
  • 12