8

I have a call-length field in my raw data that is stored as an object (string), such as 00:10:30, meaning 10 minutes and 30 seconds. How can I convert this to a number like 10.50?

I keep getting errors. If I convert the fields with pd.datetime, then I can't do an .astype('float'). In Excel, I just multiply the timestamp by 1440 (Timestamp * 24 * 60) and it outputs the numeric value I want to work with.

trench

3 Answers

11

You can use time deltas to do this more directly:

In [11]: s = pd.Series(["00:10:30"])

In [12]: s = pd.to_timedelta(s)

In [13]: s
Out[13]:
0   00:10:30
dtype: timedelta64[ns]

In [14]: s / pd.offsets.Minute(1)
Out[14]:
0    10.5
dtype: float64
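
The same conversion can also be written without an offset object. This is a minimal sketch, assuming every value is a well-formed HH:MM:SS string; the second sample value and the variable names are illustrative only:

```python
import pandas as pd

# Sample call lengths as HH:MM:SS strings (illustrative data).
s = pd.Series(["00:10:30", "01:02:15"])

# Parse the strings into timedelta64[ns] values.
td = pd.to_timedelta(s)

# Option 1: total seconds as floats, divided by 60 to get minutes.
minutes = td.dt.total_seconds() / 60

# Option 2: divide by a one-minute Timedelta, which also yields floats.
minutes_alt = td / pd.Timedelta(minutes=1)

print(minutes.tolist())
```

Both options produce a float64 Series, so further arithmetic or `.astype('float')` should work on the result.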
Andy Hayden
  • If you do this on a column that isn't converted to a datetime (just timestamp strings), you get `ValueError: only leading negative signs are allowed` (I'm using pandas version 0.25.3). I tried converting to datetime first, and got `/opt/anaconda3/envs/incident_finder/lib/python3.7/site-packages/pandas/util/_decorators.py:208: FutureWarning: Passing datetime64-dtype data to TimedeltaIndex is deprecated, will raise a TypeError in a future version return func(*args, **kwargs)` – szeitlin Apr 16 '20 at 21:03
  • In this case, you can just do ```s = pd.to_timedelta(pd.to_datetime(s))``` – jtb Mar 12 '23 at 21:34
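
If the column holds full timestamps (a date plus a time of day) rather than bare durations, which may be the situation described in the comments above, one option is to parse to datetimes and subtract midnight of the same day. This is only a sketch under that assumption; the sample values and variable names are hypothetical:

```python
import pandas as pd

# Hypothetical column of full timestamp strings (date plus time of day).
stamps = pd.Series(["2020-04-16 00:10:30", "2020-04-16 01:02:15"])

# Parse to datetime64 values.
ts = pd.to_datetime(stamps)

# Subtracting midnight of the same day leaves only the time-of-day part
# as a timedelta, without passing datetimes to to_timedelta.
time_of_day = ts - ts.dt.normalize()

# Convert the timedeltas to minutes as floats.
minutes = time_of_day.dt.total_seconds() / 60

print(minutes.tolist())
```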
2

I would convert the string to a datetime and then use the dt accessor to access the components of the time and generate your minutes column:

In [16]:

df = pd.DataFrame({'time':['00:10:30']})
df['time'] = pd.to_datetime(df['time'])
df['minutes'] = df['time'].dt.hour * 60 + df['time'].dt.minute + df['time'].dt.second/60
df
Out[16]:
                 time  minutes
0 2015-02-05 00:10:30     10.5
EdChum
  • I really hate this behaviour of `to_datetime` (to take today's date) - IMO this is almost never what you want. :( – Andy Hayden Feb 05 '15 at 20:32
  • @AndyHayden it would be preferable to pass a default date or just have no date at all; as it stands, it means having to perform a second pass on the data to fix the date – EdChum Feb 05 '15 at 20:34
  • I think it's just too forgiving here, `to_datetime` just never wants to raise (no matter how bad the data, it finds a way!!) – Andy Hayden Feb 05 '15 at 20:35
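
One way to sidestep the today's-date behaviour discussed in these comments is to pass an explicit `format`. This is a sketch, assuming all values follow HH:MM:SS; with a time-only format the date part falls back to 1900-01-01 instead of the current day, and strings that do not match the format raise an error rather than being guessed at:

```python
import pandas as pd

df = pd.DataFrame({'time': ['00:10:30']})

# An explicit format makes parsing strict; the missing date part
# defaults to 1900-01-01 rather than today's date.
parsed = pd.to_datetime(df['time'], format='%H:%M:%S')

# Only the time components are used, so the placeholder date is irrelevant.
df['minutes'] = (
    parsed.dt.hour * 60 + parsed.dt.minute + parsed.dt.second / 60
)

print(df)
```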
1

There is probably a better way of doing this, but this will work.

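from datetime import datetime

# Parse both strings onto the same default date, then subtract.
my_time = datetime.strptime('00:10:30','%H:%M:%S')
zero_time = datetime.strptime('00:00:00','%H:%M:%S')
x = my_time - zero_time  # a datetime.timedelta
x.seconds  # elapsed seconds; divide by 60 for minutes (630 / 60 = 10.5)

Out[25]: 630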
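
To get minutes rather than seconds from this approach, the steps can be wrapped in a small helper. This is a sketch using only the standard library; the function name `call_length_minutes` is hypothetical, and because `%H` only accepts 0 to 23, it assumes no call lasts 24 hours or longer:

```python
from datetime import datetime


def call_length_minutes(text):
    """Convert an 'HH:MM:SS' string to minutes as a float."""
    parsed = datetime.strptime(text, '%H:%M:%S')
    zero = datetime.strptime('00:00:00', '%H:%M:%S')
    # The difference is a timedelta; total_seconds() returns a float.
    return (parsed - zero).total_seconds() / 60


print(call_length_minutes('00:10:30'))
```

On a DataFrame this could be applied per value with something like `df['time'].map(call_length_minutes)`, though the vectorised timedelta approach is likely faster on large columns.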
Liam Foley