Once you have read the data into a pandas DataFrame, convert the Start and End columns to datetimes with pd.to_datetime:
import pandas as pd

df1 = pd.DataFrame({'Start':['21-01-2015','28-02-2019','07-04-2017','01-01-2019'],
'End':['25-11-2021','02-01-2020','10-02-2020','31-12-2019']})
# The dates are day-first (dd-mm-yyyy), so give the format explicitly;
# otherwise strings such as '07-04-2017' may be parsed month-first
df1['Start'] = pd.to_datetime(df1['Start'], format='%d-%m-%Y')
df1['End'] = pd.to_datetime(df1['End'], format='%d-%m-%Y')
# Take the difference between 'End' and 'Start' in days
df1['diff'] = (df1['End'] - df1['Start']).dt.days
# Then use a lambda function to apply the condition:
df1['Var'] = df1['diff'].apply(lambda x: 'L' if x > 364 else 'S')
print(df1)
       Start        End  diff Var
0 2015-01-21 2021-11-25  2500   L
1 2019-02-28 2020-01-02   308   S
2 2017-04-07 2020-02-10  1039   L
3 2019-01-01 2019-12-31   364   S
# Then drop the temporary diff column
df1 = df1.drop(columns='diff')
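
As an alternative sketch (not part of the approach above), the same labels can be computed in one vectorized step with numpy.where, which avoids both the lambda and the temporary column. It assumes every date string is day-first (dd-mm-yyyy), as in the sample data, and keeps the same threshold of 364 days:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    'Start': ['21-01-2015', '28-02-2019', '07-04-2017', '01-01-2019'],
    'End': ['25-11-2021', '02-01-2020', '10-02-2020', '31-12-2019'],
})

# Assumption: all dates are day-first (dd-mm-yyyy), so the format is explicit.
df1['Start'] = pd.to_datetime(df1['Start'], format='%d-%m-%Y')
df1['End'] = pd.to_datetime(df1['End'], format='%d-%m-%Y')

# 'L' when the span is longer than 364 days, otherwise 'S'.
# The day count is computed inline, so no helper column has to be dropped.
span_days = (df1['End'] - df1['Start']).dt.days
df1['Var'] = np.where(span_days > 364, 'L', 'S')
```

On a small frame the difference is cosmetic; on a large one the vectorized form is usually faster than apply with a Python lambda.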