I recently started using pandas and am trying to teach myself how to train models. My dataset has start_time and end_time columns, and I am struggling to compute the time elapsed between them, in seconds, for each row.

This is the code I tried:

[IN]

import pandas as pd

# df is the DataFrame loaded from the CSV file
st = pd.to_datetime(df['start_time'], format='%Y-%m-%d')
et = pd.to_datetime(df['end_time'], format='%Y-%m-%d')

# .dt.days keeps only the whole-day part of each difference
print((et - st).dt.days * 60 * 60 * 24)

[OUT]

0        0
1        0
2        0
3        0
4        0
        ..
10000    0
Length: 10001, dtype: int64

I looked up other similar questions. Where this one differs is that the data comes from a CSV file. I can easily apply the steps from the other answers to dummy data, but they don't work in my case.

aiox
  • If all of your differences are < 1 day in magnitude the `.days` attribute will be 0. Perhaps you want `(et-st).dt.total_seconds()` – ALollz Nov 17 '20 at 20:11
  • Does this answer your question? [Python - Convert datetime column into seconds](https://stackoverflow.com/questions/40992976/python-convert-datetime-column-into-seconds) – Ruthger Righart Nov 17 '20 at 20:16
  • @ALollz when I try that, I receive this error: File "", line 11 ^ SyntaxError: unexpected EOF while parsing – aiox Nov 17 '20 at 20:42
  • @RuthgerRighart I checked that question before I posted this one. Sadly, it did not help. – aiox Nov 17 '20 at 20:43
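
The point made in the first comment can be seen in a minimal sketch. The timestamps below are made up for illustration (they are not from the asker's CSV), and the names `elapsed`, `whole_days`, and `seconds` are hypothetical. It assumes both columns are already datetime64 and that every row spans less than one day.

```python
import pandas as pd

# Hypothetical timestamps: each row spans less than one day.
df = pd.DataFrame({
    "start_time": pd.to_datetime(["2020-11-17 08:00:00", "2020-11-17 09:30:00"]),
    "end_time": pd.to_datetime(["2020-11-17 08:05:00", "2020-11-17 10:00:00"]),
})

# Subtracting two datetime columns gives a timedelta64 Series.
elapsed = df["end_time"] - df["start_time"]

whole_days = elapsed.dt.days          # whole-day component only
seconds = elapsed.dt.total_seconds()  # full duration expressed in seconds

print(whole_days.tolist())
print(seconds.tolist())
```

Multiplying `whole_days` by 60*60*24 still gives zeros here, which would match the output shown in the question if the real rows also differ by less than a day.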

1 Answer

See the following. I fabricated some data; if you have example data that reproduces the error, please feel free to add it to the question.

import pandas as pd

# Fabricated example data: three start/end timestamp pairs
# ('23H' is the hourly alias; pandas 2.2+ prefers lowercase '23h')
df = pd.DataFrame({
    'start_time': pd.date_range('2015-01-01 01:00:00', periods=3),
    'end_time': pd.date_range('2015-01-02 02:00:00', periods=3, freq='23H'),
})

st = pd.to_datetime(df['start_time'], format='%Y-%m-%d')
et = pd.to_datetime(df['end_time'], format='%Y-%m-%d')

# Row-wise difference is a timedelta64 Series
diff = et - st

# total_seconds() converts each timedelta to a float number of seconds
df['seconds'] = diff.dt.total_seconds()
Ruthger Righart
  • When I follow this, the output I get is: 0 1 days 01:00:00 1 1 days 00:00:00 2 0 days 23:00:00 dtype: timedelta64[ns] What I really need is to calculate the elapsed time for each row in a new column. I think this is the right direction, but I am not sure what is missing – aiox Nov 17 '20 at 21:34
  • Oh, wow. I just reloaded my table and did the steps again. Without the 'df = pd.DataFrame({'start_time':pd.date_range('2015-01-01 01:00:00', periods=3), 'end_time':pd.date_range('2015-01-02 02:00:00', periods=3, freq='23H')})' part, IT WORKS LIKE A CHARM! Thank you!! – aiox Nov 17 '20 at 21:44
  • It's my pleasure! – Ruthger Righart Nov 17 '20 at 21:57
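
For the CSV case mentioned in the question, the same approach might look like the sketch below. The CSV text is a hypothetical stand-in for the real file, which is not shown in the thread, and it assumes the timestamps are stored in a format pandas can parse, such as `YYYY-MM-DD HH:MM:SS`. With a real file, a path would replace the `io.StringIO` object.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for the real file.
csv_text = """start_time,end_time
2020-11-17 08:00:00,2020-11-17 08:00:45
2020-11-17 09:00:00,2020-11-18 09:00:00
"""

# parse_dates converts the named columns to datetime64 while reading.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["start_time", "end_time"])

# Elapsed time per row, stored in a new column as float seconds.
df["seconds"] = (df["end_time"] - df["start_time"]).dt.total_seconds()

print(df["seconds"].tolist())
```

Parsing the columns at read time avoids a separate `pd.to_datetime` call, though either route should give the same timedelta arithmetic.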