
Hi, I am trying to calculate time differences for certain tasks in some data I am working on. I have a CSV file with a bunch of data; the relevant columns look like this:

ID      Start Date              End Date
123456  10/08/2021 02:00:05 AM  10/11/2021 01:00:15 AM
324524  10/11/2021 01:00:15 AM  10/08/2021 02:00:05 AM

My goal is to create a new file with the row ID, the start date, end date, and the time difference in hours.

So far I have used `pandas.to_datetime` to convert the start date and end date columns. Now I am wondering how I can calculate the difference between the two times, i.e. (end date - start date), in hours, and store it in a new column in the dataframe.

markovv.sim
  • Convert to datetime with `pandas.to_datetime` and perform a simple column-wise subtraction – mozway Oct 13 '21 at 03:38
  • Hi, my apologies, I should have said that I already did that part. I have already done the conversion using `to_datetime` and am having trouble working out how to store the difference in hours in a new column of the dataframe. Please see my edited question. – markovv.sim Oct 13 '21 at 04:13
  • `df['diff'] = (df['End Date']-df['Start Date']).total_seconds()/3600` – mozway Oct 13 '21 at 04:17
  • I just tried this and now I get an error: `AttributeError: 'Series' object has no attribute 'total_seconds'` – markovv.sim Oct 13 '21 at 05:03
  • Never mind, doing `df['diff'] = (df['ed']-df['sd']).dt.total_seconds()/3600` worked! Thanks again :) – markovv.sim Oct 13 '21 at 05:09
  • Yes, sorry, I forgot the `dt` ;) – mozway Oct 13 '21 at 05:17
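
For anyone landing here later, the approach the comments converge on can be sketched roughly as below. Subtracting two datetime columns gives a Series of timedeltas, and `total_seconds()` is reached through the `.dt` accessor rather than called on the Series directly, which is what caused the `AttributeError`. This is only a minimal sketch: the inline sample data stands in for the real CSV, the `diff_hours` column name is made up for illustration, and the format string assumes the dates are month/day/year, which the question does not confirm.

```python
import io

import pandas as pd

# Stand-in for the real CSV file, using the two sample rows from the question.
csv_text = """ID,Start Date,End Date
123456,10/08/2021 02:00:05 AM,10/11/2021 01:00:15 AM
324524,10/11/2021 01:00:15 AM,10/08/2021 02:00:05 AM
"""
df = pd.read_csv(io.StringIO(csv_text))

# Assumption: dates are month/day/year with a 12-hour clock.
fmt = "%m/%d/%Y %I:%M:%S %p"
df["Start Date"] = pd.to_datetime(df["Start Date"], format=fmt)
df["End Date"] = pd.to_datetime(df["End Date"], format=fmt)

# Datetime subtraction yields a timedelta Series; .dt exposes total_seconds().
# "diff_hours" is a hypothetical column name chosen for this example.
df["diff_hours"] = (df["End Date"] - df["Start Date"]).dt.total_seconds() / 3600

# Returns the CSV as a string; pass a file path as the first argument
# to write a new file instead.
csv_out = df.to_csv(index=False)
```

Rows where the end date is earlier than the start date, like the second sample row, come out with a negative number of hours.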

0 Answers