5

I have two columns that both contain times and I need to get the difference of the two times. I would like to add the difference of each row timestamps to a new column "time_diff". The times are only going to be 10-30 seconds apart so I need the time_diff column to be a difference in the seconds(like this format 00:00:07).

I'm really struggling with this its for my work and it is a bit out of my element. Greatly appreciate all of the answers.

Example of the format of the two columns

start_time | end_time


00:06:34 00:06:45

00:06:59 00:07:02

00:07:36 00:07:34

  • 1
    Does this answer your question? [How to calculate the time (in seconds) difference between two DateTime columns using pandas?](https://stackoverflow.com/questions/58160864/how-to-calculate-the-time-in-seconds-difference-between-two-datetime-columns-u) – RichieV Sep 01 '20 at 04:30

3 Answers3

7

First convert these into datetime format as given below:

df['start'] = pd.to_datetime(df['start'])
df['end'] = pd.to_datetime(df['end'])

Then, you can perform subtract operations:

df['diff'] = df['end']-df['start']

This will give you answer in HH:MM:SS

In case you want to find answers only in seconds (it will give output in total seconds of difference)

df['diff'] = (df['end']-df['start']).dt.total_seconds()
Dhiraj Bansal
  • 417
  • 3
  • 8
  • Thanks so much! Quick question, do you know how to set a conditional for the time difference column that would be able to get the numbers past a certain threshold(say like grab every row where there is a 5 second difference)? – Terrance Whitehurst Sep 02 '20 at 23:08
  • you can try as below: df[(df['diff'] > 5)] – Dhiraj Bansal Sep 03 '20 at 04:58
2

Similar to @Dhiraj

df["time_diff"] = pd.to_datetime(df["end_time"]) - pd.to_datetime(df["start_time"])

df["time_diff_secs"] = (pd.to_datetime(df["end_time"]) - pd.to_datetime(df["start_time"])).dt.total_seconds()

OUTPUT->

   start_time end_time         time_diff  time_diff_secs
0   00:06:34  00:06:45          00:00:11            11.0
1   00:06:59  00:07:02          00:00:03             3.0
2   00:07:36  00:07:34 -1 days +23:59:58            -2.0
Utpal Kumar
  • 806
  • 6
  • 10
  • Thanks so much! Quick question, do you know how to set a conditional for the time difference column that would be able to get the numbers past a certain threshold(say like grab every row where there is a 5 second difference)? – Terrance Whitehurst Sep 02 '20 at 23:08
  • You can try doing something like this: `df_new = df[df["time_diff_secs"]==5]`. This will give you every row with time difference equals (you can change the conditions) 5 and we store that in a new dataframe. – Utpal Kumar Sep 03 '20 at 02:40
1

You should be able to just do df["difference"] = df["end_time"] - df["start_time"] assuming your columns aren't strings. You can use pandas.to_datetime() to convert a column into datetime if that's the case. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html

g23
  • 666
  • 3
  • 9