I've looked through Python advice and worked out how to calculate the difference between two dates (e.g. "Difference between two dates?"). That works, but I'm working with variables in a dataframe. I'm sure I'm following the advice I've read, but I'm getting:

TypeError: strptime() argument 1 must be str, not Series 

Here's the code:

df['DAYSDIFF'] = (datetime.datetime.strptime(df['SDATE'],"%d/%m/%Y") - datetime.datetime.strptime(df['QDATE'],"%d/%m/%Y"))

Thanks again for help!

Angus

1 Answer

Use pandas.to_datetime:

df["SDATE"] = pd.to_datetime(df["SDATE"], format="%d/%m/%Y")
df["QDATE"] = pd.to_datetime(df["QDATE"], format="%d/%m/%Y")

df["DAYSDIFF"] = df["SDATE"] - df["QDATE"]

This is needed because datetime.strptime expects a single string and does not accept a pandas Series.
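
For anyone who wants to try this end to end, here is a minimal sketch. The sample dates are made up for illustration; only the column names SDATE, QDATE and DAYSDIFF come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "SDATE": ["21/08/2016", "05/01/2016"],
    "QDATE": ["19/08/2016", "01/01/2016"],
})

# strptime parses one string at a time; to_datetime parses the whole column.
df["SDATE"] = pd.to_datetime(df["SDATE"], format="%d/%m/%Y")
df["QDATE"] = pd.to_datetime(df["QDATE"], format="%d/%m/%Y")

# Subtracting two datetime columns yields a column of timedeltas.
df["DAYSDIFF"] = df["SDATE"] - df["QDATE"]
print(df["DAYSDIFF"])
```

The result is a timedelta column, which is why the values print with a "days" suffix instead of as plain numbers.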

Simon Kirsten
  • Thanks, that works a treat. Though the DAYSDIFF is coming out as "2 days" instead of the number 2. Is that just a formatting thing? – Angus Aug 19 '16 at 15:41
  • Yes, this is just a formatting thing. For more information on time series data read [the documentation](http://pandas.pydata.org/pandas-docs/stable/timeseries.html). – Simon Kirsten Aug 19 '16 at 15:43
  • You can divide `DAYSDIFF` by `np.timedelta(1, "D")` to get an int. – jdmcbr Aug 19 '16 at 15:48
  • that works too thanks - except it's np.timedelta64 – Angus Aug 22 '16 at 13:20
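
To summarise the comments, here is a rough sketch of two ways to turn a timedelta column into plain numbers. The sample values are hypothetical. Note that dividing by `np.timedelta64(1, "D")` gives floats rather than ints, while the `.dt.days` accessor gives the integer day component.

```python
import numpy as np
import pandas as pd

# Hypothetical timedelta values standing in for the DAYSDIFF column.
diff = pd.Series(pd.to_timedelta(["2 days", "4 days 12:00:00"]))

# Integer day component of each timedelta (the 12 hours are dropped).
whole_days = diff.dt.days

# Division by a one-day timedelta keeps the fractional part as a float.
frac_days = diff / np.timedelta64(1, "D")

print(whole_days.tolist())
print(frac_days.tolist())
```

If the dates carry no time of day, both approaches give the same numbers, so `.dt.days` may be the simpler choice when an integer column is wanted.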