Is there a function that converts a column of "hh:mm:ss" strings into seconds (like 12:30:05 => (12*60+30)*60+5)?

Question

I have a data frame with a column named "system state". The state changes between several values and is updated at irregular intervals of several minutes.

I want the time spent in each state.

Example:

System state  time (of the day)
high          23:37:06
low           23:34:05
normal        22:58:05
normal        22:37:05
normal        22:31:06
low              ...
low
high 
high 
high

Do I necessarily have to calculate it manually? I tried df['time'] = df['time'].dt.strftime(%S), but it didn't work.

Using only `%S` gives you just the seconds within that minute. To get the total, use something like [this](https://stackoverflow.com/a/7852969) and take the result modulo 86400 to get the seconds elapsed that day. — Zero, Jul 05 '23 at 14:58
`dt.strftime(%S)` is a syntax error, because `%S` is not in quotes... — John Gordon, Jul 05 '23 at 15:00

score 1 · Answer 1 · answered Jul 05 '23 at 15:02

If you convert your time column to a datetime, you can subtract the start of today and get the total seconds from the resulting timedelta.

df['time'] = pd.to_datetime(df['time'])
# Seconds since midnight (time-only strings are parsed with today's date)
(df['time'] - pd.Timestamp.now().normalize()).dt.total_seconds()

Output

0    85026.0
1    84845.0
2    82685.0
3    81425.0
4    81066.0
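
As an alternative sketch (not part of the original answer), "hh:mm:ss" strings can also be parsed directly as durations with `pd.to_timedelta`, which avoids depending on today's date. The `times` Series here is a stand-in for the question's time column.

```python
import pandas as pd

# Stand-in for the question's time column
times = pd.Series(["23:37:06", "23:34:05", "22:58:05", "22:37:05", "22:31:06"])

# "hh:mm:ss" is read as a duration since midnight
seconds = pd.to_timedelta(times).dt.total_seconds()
print(seconds.tolist())  # [85026.0, 84845.0, 82685.0, 81425.0, 81066.0]
```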

OCa · Answer 2 · 2023-07-07T09:07:32.503

import numpy as np
import pandas as pd
import datetime as dt

1) Replicating a part of your dataframe

states = ['high','low','normal','normal','normal']
times = ['23:37:06','23:34:05','22:58:05','22:37:05','22:31:06']
df = pd.DataFrame(columns = ['state','time'], data = list(zip(states, times)))
df
    state      time
0    high  23:37:06
1     low  23:34:05
2  normal  22:58:05
3  normal  22:37:05
4  normal  22:31:06

2) Convert string to actual time

pd.to_datetime converts strings such as "23:37:06" into datetime values that pandas can do arithmetic on. Since the strings carry no date, the current date is filled in.

df['T'] = pd.to_datetime(df['time'])
df
    state      time                   T
0    high  23:37:06 2023-07-05 23:37:06
1     low  23:34:05 2023-07-05 23:34:05
2  normal  22:58:05 2023-07-05 22:58:05
3  normal  22:37:05 2023-07-05 22:37:05
4  normal  22:31:06 2023-07-05 22:31:06

df.dtypes
state            object
time             object
T        datetime64[ns]
dtype: object

3) Calculate the time interval from a "state update" to the next

.shift(1) lets you work on a shifted copy of the time column so calculating a difference is straightforward.

Note that the duration of the latest state cannot be determined from the given data, so it appears as "Not a Time" (NaT).

df['NextT'] = df['T'].shift(1)
df['DeltaT'] = df['NextT'] - df['T']
df
    state      time                   T               NextT          DeltaT
0    high  23:37:06 2023-07-05 23:37:06                 NaT             NaT
1     low  23:34:05 2023-07-05 23:34:05 2023-07-05 23:37:06 0 days 00:03:01
2  normal  22:58:05 2023-07-05 22:58:05 2023-07-05 23:34:05 0 days 00:36:00
3  normal  22:37:05 2023-07-05 22:37:05 2023-07-05 22:58:05 0 days 00:21:00
4  normal  22:31:06 2023-07-05 22:31:06 2023-07-05 22:37:05 0 days 00:05:59

4) Display your time interval column in seconds:

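# total_seconds() stays correct for gaps of a day or more; .dt.seconds would drop the days part
df['Delta seconds'] = df['DeltaT'].dt.total_seconds()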
df[['state','Delta seconds']]
    state  Delta seconds
0    high            NaN
1     low          181.0
2  normal         2160.0
3  normal         1260.0
4  normal          359.0

5) Finally add up time intervals from identical states

pd.pivot_table(data = df,
               values = ['Delta seconds'],
               index = ['state'],
               aggfunc = 'sum')  # passing np.sum raises a FutureWarning in recent pandas

        Delta seconds
state                
high              0.0
low             181.0
normal         3779.0
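
The five steps above can be condensed into a short sketch. It assumes, as in the example, that rows are sorted newest first and that all timestamps fall on the same day (no midnight crossing). The column name `seconds` and the variable `totals` are illustrative choices.

```python
import pandas as pd

df = pd.DataFrame({
    "state": ["high", "low", "normal", "normal", "normal"],
    "time": ["23:37:06", "23:34:05", "22:58:05", "22:37:05", "22:31:06"],
})

# Parse "hh:mm:ss" as a duration since midnight
t = pd.to_timedelta(df["time"])

# Rows are newest first, so the row above holds the next state update
df["seconds"] = (t.shift(1) - t).dt.total_seconds()

# Sum the durations per state; the unknown (NaN) latest duration counts as 0
totals = df.groupby("state")["seconds"].sum()
print(totals)
```

Here `groupby(...).sum()` plays the same role as the pivot table: it skips the NaN of the most recent row, so "high" ends up with 0.0.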

Annex: strftime

Provided you have already converted the 'time' column to datetime, dt.strftime only extracts one component of the time (e.g. the seconds within the minute); it does not compute total seconds.

df['T'].dt.strftime('%S')
0    06
1    05
2    05
3    05
4    06
Name: T, dtype: object
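
To make the contrast concrete, here is a small sketch that compares the `'%S'` component with seconds since midnight built from the `.dt.hour`, `.dt.minute` and `.dt.second` accessors. It passes an explicit `format="%H:%M:%S"` (an assumption, not used in the answer above), so the date part defaults to 1900-01-01 instead of today.

```python
import pandas as pd

times = pd.Series(["23:37:06", "23:34:05", "22:58:05"])
T = pd.to_datetime(times, format="%H:%M:%S")  # date part defaults to 1900-01-01

# strftime('%S') returns only the seconds field, as strings
seconds_part = T.dt.strftime("%S")
print(seconds_part.tolist())  # ['06', '05', '05']

# Seconds since midnight: (hh*60 + mm)*60 + ss
since_midnight = T.dt.hour * 3600 + T.dt.minute * 60 + T.dt.second
print(since_midnight.tolist())  # [85026, 84845, 82685]
```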
