2

I have a dataframe with timestamps in two of its columns, and I want to subtract them so that I get the time difference in hours and minutes.

ColA    Timestamp           Timestamp2            
1   06:40:00              17:40:00     
2   06:29:00              16:29:00          
3   07:05:00              15:29:00  
4   06:43:00              18:55:00   

I tried the following code but it only gives me the number of hours (an integer).

for m in range(4):
    j = df.iloc[m,0]
    d1 = df.iloc[m,2]
    d2 = df.iloc[m,1]
    td = d1-d2
    q = td.total_seconds() / 3600
    print ("Timeinterval %s is %d hours." %(j, q))

I also tried it with the function (it gives me a tuple, or if I ignore the thing after the comma I get the same result as before):

def days_hours_minutes(td):
    return td.seconds//3600, (td.seconds//60)%60

Also,

def datetime_to_float(d):
    return d.timestamp()

throws "'Timedelta' object has no attribute 'timestamp'".

Subtracting the two timestamps works, but I want the output to be a float (e.g. 8.5 hours).

Fabulini

2 Answers

-1

If your Timestamp columns have dtype object, convert them first; otherwise skip straight to the next step:

df['Timestamp'] = pd.to_datetime(df['Timestamp'] )
df['Timestamp2'] = pd.to_datetime(df['Timestamp2'])

Now extract the hours and minutes using dt and divide the minutes column by 60:

df['diff_h'] = (df['Timestamp2'].dt.hour - df['Timestamp'].dt.hour)
df['diff_m'] = (df['Timestamp2'].dt.minute - df['Timestamp'].dt.minute)
df['diff_m'] = round(df['diff_m']/60, 2)
df['final_diff'] = df['diff_h'].astype(float) + df['diff_m']

Remove the unnecessary columns if you want:

del df['diff_h'], df['diff_m']

Output:

   ColA           Timestamp          Timestamp2  final_diff
0     1 2019-08-13 06:40:00 2019-08-13 17:40:00        11.0
1     2 2019-08-13 06:29:00 2019-08-13 16:29:00        10.0
2     3 2019-08-13 07:05:00 2019-08-13 15:29:00         8.4
3     4 2019-08-13 06:43:00 2019-08-13 18:55:00        12.2


PS: This approach, although easier, adds a date component (today's date, 13-08-2019 in this case) because of to_datetime. Store the converted values in a different column if you want to preserve the timestamps in their original format.
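
An alternative to combining the hour and minute components is to subtract the two columns directly and convert the resulting timedeltas to fractional hours. The following is only a minimal sketch, assuming the columns hold `HH:MM:SS` strings as in the question; the sample frame is rebuilt from the question's data and the `diff_hours` column name is made up for illustration.

```python
import pandas as pd

# Sample data copied from the question; the real frame may differ.
df = pd.DataFrame({
    "ColA": [1, 2, 3, 4],
    "Timestamp": ["06:40:00", "06:29:00", "07:05:00", "06:43:00"],
    "Timestamp2": ["17:40:00", "16:29:00", "15:29:00", "18:55:00"],
})

# pd.to_timedelta parses "HH:MM:SS" strings as durations,
# so no date component is attached to the values.
start = pd.to_timedelta(df["Timestamp"])
end = pd.to_timedelta(df["Timestamp2"])

# Series.dt.total_seconds() works on a whole timedelta column at once;
# dividing by 3600 gives hours as a float.
# "diff_hours" is a hypothetical column name.
df["diff_hours"] = ((end - start).dt.total_seconds() / 3600).round(2)

print(df)
```

Because the subtraction happens on durations rather than on separate hour and minute fields, this avoids a Python loop and leaves the original columns untouched.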
Ankur Sinha
  • But 8 hours and 24 minutes is not 8.24 hours. Also, I tried your code and in my bigger dataframe it's not able to convert something like "11.-33" to float. – Fabulini Aug 13 '19 at 09:56
  • Sihna, sorry, we commented at the same time. To answer your last question: Yes, exactly. Two digits is enough. – Fabulini Aug 13 '19 at 09:59
  • Updated. This should do the trick :) Please check. Also, for larger datasets, avoid using loops. Use the dt operator if possible. – Ankur Sinha Aug 13 '19 at 10:00
  • Also, 11.-33 is an invalid number in the first place. Did you mean -11.33? Please post that particular row may be. – Ankur Sinha Aug 13 '19 at 10:02
-1
q = td.total_seconds() / 3600
print ("Timeinterval %s is %d hours." %(j, q))

You should change this part of the code.

%d formats the value as an integer, so q is shown as an int.

Use %f instead.

Better still, use str.format, which shows the value of the variable as it is:

print("Timeinterval {} is {} hours.".format(j, q))
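
The difference between the format specifiers can be seen in a small standalone sketch. It assumes a fixed 8.5-hour `timedelta` and a made-up interval label `1` in place of the dataframe values.

```python
from datetime import timedelta

# Hypothetical interval of 8 hours 30 minutes, standing in for d1 - d2.
td = timedelta(hours=8, minutes=30)
q = td.total_seconds() / 3600  # fractional hours as a float

# %d converts the float to an integer, dropping the fractional part.
as_int = "Timeinterval %s is %d hours." % (1, q)
# %.2f keeps the value as a float with two decimal places.
as_float = "Timeinterval %s is %.2f hours." % (1, q)
# str.format with {} uses the float's default string form.
as_format = "Timeinterval {} is {} hours.".format(1, q)

print(as_int)     # Timeinterval 1 is 8 hours.
print(as_float)   # Timeinterval 1 is 8.50 hours.
print(as_format)  # Timeinterval 1 is 8.5 hours.
```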
Asfy
  • nothing called total seconds – rishi jain Oct 08 '20 at 09:42
  • When `df['Timestamp'].total_seconds()` isn't defined, you could try `df['Timestamp'].dt.total_seconds()` or convert `df['Timestamp']` to another format to use `total_seconds()`. https://stackoverflow.com/questions/45884288/pandas-series-dt-total-seconds-not-found – Jeong Kim Jun 16 '21 at 15:29