3

I have the following dataframe.
The dates are dtype='datetime64[ns, UTC]'

    announce_date  announce_time
0   2013-01-23     After Market Close
1   2013-04-23     Before Market Open
2   2013-07-20     After Market Close

How do I create a new column that offsets
announce_date by 1 if announce_time is "After Market Close".

    announce_date  announce_time          impact_date
0   2013-01-23     After Market Close     2013-01-24
1   2013-04-23     Before Market Open     2013-04-23
2   2013-07-20     After Market Close     2013-07-21

I can do an offset date column but haven't figure out how to apply the condition.

import pandas as pd

df['impact_date'] = df['announce_date'] + pd.DateOffset(days=1)
cs95
  • 379,657
  • 97
  • 704
  • 746
GollyJer
  • 23,857
  • 16
  • 106
  • 174

3 Answers3

3

Generate a boolean mask and generate offsets from it using pd.to_timedelta.

# Convert "announce_date" to datetime if not already done.
# df['announce_date'] = pd.to_datetime(df['announce_date'], errors='coerce')
# Now calculate the offset.
offset = pd.to_timedelta(
    df['announce_time'].eq("After Market Close").astype(int), unit='D')    
df['impact_date'] = df['announce_date'] + offset

print(df)
  announce_date       announce_time impact_date
0    2013-01-23  After Market Close  2013-01-24
1    2013-04-23  Before Market Open  2013-04-23
2    2013-07-20  After Market Close  2013-07-21
cs95
  • 379,657
  • 97
  • 704
  • 746
1

You can use np.where

df['impact_date'] = df['announce_date'] + np.where(df['announce_time']=="After Market Close",pd.DateOffset(days=1),pd.DateOffset(days=0))

This answer has detailed explanation.

Sach
  • 845
  • 3
  • 9
  • 22
  • Thanks Sach. I swear I tried this approach and it didn't work. For anyone playing along this works but throws a warning. `/usr/local/lib/python2.7/dist-packages/pandas/core/ops.py:477: PerformanceWarning: Adding/subtracting array of DateOffsets to Series not vectorized "Series not vectorized", PerformanceWarning)` – GollyJer Nov 07 '18 at 00:36
  • @GollyJer If you're generating a series of timedeltas, the recommended method of doing so is using `pd.to_timedelta`. See my answer. – cs95 Nov 07 '18 at 00:36
1

You could just create a while loop traversing through the announce_time like this...

count = 0
list1 = []
while count != len(df):
    if df.iloc[count,1] == 'After Market Close':
        list1.append(df.iloc[count,0] + pd.DateOffset(days=1))
        count += 1
    else:
        list1.append(df.iloc[count,0])
        count += 1

df['impact_date'] = list1