-1

I have a dataset with two columns: Actual Time and Promised Time (representing the actual and promised start times of some process).

For example:

import pandas as pd
example_df = pd.DataFrame(columns = ['Actual Time', 'Promised Time'],
                 data = [
                     ('2016-6-10 9:00', '2016-6-10 9:00'),
                     ('2016-6-15 8:52', '2016-6-15 9:52'),
                     ('2016-6-19 8:54', '2016-6-19 9:02')]).applymap(pd.Timestamp)

So as we can see, sometimes Actual Time = Promised Time, but there are also cases where Actual Time < Promised Time.

I defined a column that shows the difference between these two columns (example_df['Actual Time']-example_df['Promised Time']), but the problem is that for the third row it returned -1 day +23:52:00 instead of - 00:08:00.

Itamar Mushkin
  • 2,803
  • 2
  • 16
  • 32
shirin elahi
  • 105
  • 1
  • 9
  • can you reformat your data? It's hard to read this way.. are there multiple rows or is this a single row? – algorythms Sep 25 '19 at 05:12
  • 2
    Where is the code? – Aryerez Sep 25 '19 at 05:14
  • I've suggested an edit to replace your example data with code that generates it. Know that if you use columns with spaces, it's hard to use `pd.read_clipboard()` to copy your data. To learn more, please visit this great question: https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – Itamar Mushkin Sep 25 '19 at 05:18
  • So you want absolute values, no `-00:08:00` ? Then please change answer, because question is how change `-1 day +23:52:00` to `- 00:08:00` and accepting answer change to `00:08:00` – jezrael Sep 26 '19 at 04:49

3 Answers3

2

I assume your dataframe already in datetime dtype. abs works just fine

Without abs

df['Actual Time'] - df['Promised Time']

Out[526]:
0            00:00:00
1   -1 days +23:00:00
2   -1 days +23:52:00
dtype: timedelta64[ns]

With abs

abs(df['Promised Time'] - df['Actual Time'])

Out[529]:
0   00:00:00
1   01:00:00
2   00:08:00
dtype: timedelta64[ns]
Andy L.
  • 24,909
  • 4
  • 17
  • 29
2

Sample:

print (df)
       Actual Time   Promised Time
0   2016-6-10 9:00  2016-6-10 9:00
1  2016-6-15 10:52  2016-6-15 9:52 <- changed datetimes
2   2016-6-19 8:54  2016-6-19 9:02

def format_timedelta(x):
    ts = x.total_seconds()
    if ts >= 0:
        hours, remainder = divmod(ts, 3600)
        minutes, seconds = divmod(remainder, 60)
        return ('{}:{:02d}:{:02d}').format(int(hours), int(minutes), int(seconds)) 
    else:
        hours, remainder = divmod(-ts, 3600)
        minutes, seconds = divmod(remainder, 60)
        return ('-{}:{:02d}:{:02d}').format(int(hours), int(minutes), int(seconds)) 

First create datetimes:

df['Actual Time'] = pd.to_datetime(df['Actual Time'])
df['Promised Time'] = pd.to_datetime(df['Promised Time'])

And then timedeltas:

df['diff'] = (df['Actual Time'] - df['Promised Time'])

If convert negative timedeltas to seconds by Series.dt.total_seconds it working nice:

df['diff1'] = df['diff'].dt.total_seconds()

But if want negative timedeltas in string representation it is possible with custom function, because strftime for timedeltas is not yet implemented:

df['diff2'] = df['diff'].apply(format_timedelta)
print (df)
          Actual Time       Promised Time              diff   diff1     diff2
0 2016-06-10 09:00:00 2016-06-10 09:00:00          00:00:00     0.0   0:00:00
1 2016-06-15 10:52:00 2016-06-15 09:52:00          01:00:00  3600.0   1:00:00
2 2016-06-19 08:54:00 2016-06-19 09:02:00 -1 days +23:52:00  -480.0  -0:08:00
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
0

The difference result is timedelta type which by default is in ns format.

You need to change the type of your result to you desired format:

import pandas as pd

df=pd.DataFrame(data={
'Actual Time':['2016-6-10 9:00','2016-6-15 8:52','2016-6-19 8:54'],
'Promised Time':['2016-6-10 9:00','2016-6-15 9:52','2016-6-19 9:02']
},dtype='datetime64[ns]')

# here you need to add the `astype` part and to determine the unit you want
df['diff']=(df['Actual Time']-df['Promised Time']).astype('timedelta64[m]')


Adirmola
  • 783
  • 5
  • 15