0

I want to subtract two datetime values and the output in HH:MM:SS format.

If the difference is more than one day, the days count needs to be added to hours.

I have two columns: started_at and ended_at

I tried to create a new column trip_duration:

df['trip_duration'] = df['ended_at'] - df['started_at']

sample table:

| stated_at | ended_at |
| -------- | -------- |
| 2022-08-18 18:16:28+00:00   | 2022-08-18 19:20:28+00:00   |
| 2022-10-07 14:21:58+00:00   | 2022-10-07 14:41:58+00:00   |
| 2022-10-10 14:21:58+00:00   | 2022-10-11 02:21:58+00:00   |

Notice that the last row the dates are different. Started at 2022-10-10 and ended at 2022-10-11.

I believe I should add some conditional for when situation happens. I mean, when the time is smaller (02-21-58+00:00 < 14-21-58+00:00), but it is from another day in the future.

The desirable output would be:

trip_duration
01:04:00
00:20:00
36:00:00
FObersteiner
  • 22,500
  • 8
  • 42
  • 72

1 Answers1

0
import pandas as pd

# Create a sample dataframe
df = pd.DataFrame({'started_at': ['2022-08-18 18:16:28+00:00', '2022-10-07 14:21:58+00:00', '2022-10-10 14:21:58+00:00'],
                   'ended_at': ['2022-08-18 19:20:28+00:00', '2022-10-07 14:41:58+00:00', '2022-10-11 02:21:58+00:00']})

# Convert the columns to datetime objects
df['started_at'] = pd.to_datetime(df['started_at'])
df['ended_at'] = pd.to_datetime(df['ended_at'])

# Create a new column 'trip_duration'
df['trip_duration'] = df['ended_at'] - df['started_at']

# Extract hours, minutes, seconds and format as string
df['trip_duration'] = df['trip_duration'].apply(lambda x: x.seconds // 3600 + x.days*24)
df['trip_duration'] = df['trip_duration'].apply(lambda x: '{:02d}:{:02d}:{:02d}'.format(x // 3600, (x % 3600) // 60, x % 60))

# Print the resulting dataframe
print(df)

Edit: fix the error the second df['trip_duration'] is not a Timedelta object.

Suren
  • 193
  • 8
  • It returns me an error at the line: df['trip_duration'] = df['trip_duration'].apply(lambda x: x.seconds // 60) --- the error: AttributeError: 'int' object has no attribute 'seconds' – Lucas Correa Jan 21 '23 at 08:03
  • It shouldn't raise an error now. – Suren Jan 21 '23 at 16:39