0

I have a pandas Dataframe with two date columns (A and B) and I would like to create a 3rd column (C) that holds dates created using month and year from column A and the day of column B. Obviously I would need to change the day for the months that day doesn't exist like we try to create 31st Feb 2020, it would need to change it to 29th Feb 2020.

For example

import pandas as pd
df = pd.DataFrame({'A': ['2020-02-21', '2020-03-21', '2020-03-21'], 
                   'B': ['2020-01-31', '2020-02-11', '2020-02-01']})
for c in df.columns:
    dfx[c] = pd.to_datetime(dfx[c])

Then I want to create a new column C that is a new datetime that is:

year = df.A.dt.year

month = df.A.dt.month

day = df.B.dt.day

I don't know how to create this column. Can you please help?

  • Please have a look at [How to make good good pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) and provide a [mcve] including sample input, sample output, and code for what you've already tried based on your own research – G. Anderson Mar 23 '20 at 22:00
  • Please clarify what exactly the issue is. – AMC Mar 23 '20 at 22:12
  • I added more details to help. – user3047030 Mar 23 '20 at 23:09

1 Answers1

1

Here is one way to do it, using pandas' time series functionality:

import pandas as pd

# your example data
df = pd.DataFrame({'A': ['2020-02-21', '2020-03-21', '2020-03-21'], 
                   'B': ['2020-01-31', '2020-02-11', '2020-02-01']})
for c in df.columns:
    # keep using the same dataframe here
    df[c] = pd.to_datetime(df[c])

# set back every date from A to the end of the previous month,
# then add the number of days from the date in B
df['C'] = df.A - pd.offsets.MonthEnd() + pd.TimedeltaIndex(df.B.dt.day, unit='D')

display(df)

Result:

             A           B           C
0   2020-02-21  2020-01-31  2020-03-02
1   2020-03-21  2020-02-11  2020-03-11
2   2020-03-21  2020-02-01  2020-03-01

As you can see in row 0, this handles the case of "February 31st" not quite as you suggested, but still in a logical way.

Arne
  • 9,990
  • 2
  • 18
  • 28