I would like to convert columns in a large dataset that store time values as strings (object dtype) to integers.

I would like to convert "Duration" and "Talk Time" to int so I can perform some basic math, such as the mean, maximum, and minimum time recorded in the dataset.

   Duration  Talk Time
0  00:00:27  00:00:26
1  00:00:58  00:00:58
2  00:00:48  00:00:46
3  00:00:38  00:00:38
4  00:00:11  00:00:11

df['Duration'].astype(np.int64)

I get the error:

ValueError: invalid literal for int() with base 10: '00:00:27'

Leonardo
  • "The current time is display as 00:01:56." What integer do you want to get if you convert that time, and why? According to what logic? – Karl Knechtel May 19 '22 at 21:22
  • Does https://stackoverflow.com/questions/54312802/pandas-convert-from-datetime-to-integer-timestamp answer the question? – Karl Knechtel May 19 '22 at 21:24
  • I would like to get the average time and some other calculations but I cannot do that because it's formatted as an object. – Leonardo May 19 '22 at 21:25
  • Thank you for the help, Karl, but that question does not solve my problem since I am not interested in dates. I am only interested in the time. – Leonardo May 19 '22 at 21:28
  • What does "average time" mean? What should the "average" of noon and midnight be - 6pm? 6am? something else? – Karl Knechtel May 19 '22 at 21:29
  • Does https://stackoverflow.com/questions/27907902/datetime-objects-with-pandas-mean-function answer the question? – Karl Knechtel May 19 '22 at 21:30

1 Answer

Assuming you want to convert to seconds, that the values use the HH:MM:SS representation, and that each value is always an 8-character string, the fastest way would be: df['dur_in_secs'] = df['duration'].str[0:2].astype(int)*3600 + df['duration'].str[3:5].astype(int)*60 + df['duration'].str[6:8].astype(int)

If it is not, try to isolate each part of an individual string x (for example, inside apply) with

 s_idx = x.rfind(':')    # index of the last colon
 sec = int(x[s_idx+1:])  # everything after it is the seconds part

etc

Sorry about that, trying to be clever :-(

This version has been tested:

df["secs"] = df["duration"].apply(lambda x: int(x[0:2])*3600 + int(x[3:5])*60 + int(x[6:8]))
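
As an alternative sketch (not part of the tested one-liner above), pandas can parse HH:MM:SS strings itself with pd.to_timedelta, which avoids the manual slicing. The sample rows are copied from the question; the column name Duration_secs is made up for illustration, and the cast to int64 assumes there are no missing values.

```python
import pandas as pd

# First five rows from the question.
df = pd.DataFrame({
    "Duration": ["00:00:27", "00:00:58", "00:00:48", "00:00:38", "00:00:11"],
    "Talk Time": ["00:00:26", "00:00:58", "00:00:46", "00:00:38", "00:00:11"],
})

# Parse the strings as timedeltas, then express them as whole seconds.
# Duration_secs is a hypothetical column name.
df["Duration_secs"] = (
    pd.to_timedelta(df["Duration"]).dt.total_seconds().astype("int64")
)

mean_secs = df["Duration_secs"].mean()
max_secs = df["Duration_secs"].max()
min_secs = df["Duration_secs"].min()
```

The same call works on "Talk Time". Keeping the values as timedeltas instead of integers would also allow mean(), max() and min() directly, if a duration result is acceptable.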
Leonardo
  • Thank you for your answer. I tried that code and I got this error: TypeError: 'float' object is not subscriptable – Leonardo May 23 '22 at 15:13
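
A likely cause of that TypeError, assuming the column contains missing entries: pandas represents a missing value as NaN, which is a float, so slicing it with x[0:2] fails. Below is a minimal sketch of a guarded helper. The name hms_to_seconds is hypothetical, and returning NaN for anything that is not a valid HH:MM:SS string is an assumption about the desired behavior.

```python
import math


def hms_to_seconds(value):
    """Convert an 'HH:MM:SS' string to seconds; return NaN for anything else."""
    if not isinstance(value, str):
        return math.nan  # e.g. a missing value stored as float NaN
    parts = value.strip().split(":")
    if len(parts) != 3:
        return math.nan  # not in HH:MM:SS form
    try:
        hours, minutes, seconds = (int(p) for p in parts)
    except ValueError:
        return math.nan  # a part was not an integer
    return hours * 3600 + minutes * 60 + seconds
```

It could then be applied as df["secs"] = df["duration"].apply(hms_to_seconds). If any entry is NaN the resulting column is float rather than int, and mean(), max() and min() skip NaN by default.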