
My DataFrame looks like this:

    start               stop
0   2015-11-04 10:12:00 2015-11-06 06:38:00
1   2015-11-04 10:23:00 2015-11-05 08:30:00
2   2015-11-04 14:01:00 2015-11-17 10:34:00
4   2015-11-19 01:43:00 2015-12-21 09:04:00

print(time_df.dtypes)

start       datetime64[ns]
stop        datetime64[ns]

dtype: object

I am trying to find the time difference between stop and start.

I tried pd.Timedelta(df_time['stop']-df_time['start']), but it raises TypeError: data type "datetime" not understood.

df_time['stop']-df_time['start'] also gives the same error.

My expected output:

 2D,?H
 1D,?H
 ...
 ...
Pyd

2 Answers


You need to omit pd.Timedelta, because subtracting one datetime column from another already returns timedeltas:

df_time['td'] = df_time['stop']-df_time['start']
print(df_time)
                start                stop               td
0 2015-11-04 10:12:00 2015-11-06 06:38:00  1 days 20:26:00
1 2015-11-04 10:23:00 2015-11-05 08:30:00  0 days 22:07:00
2 2015-11-04 14:01:00 2015-11-17 10:34:00 12 days 20:33:00

EDIT: Another solution is to subtract the underlying NumPy arrays:

df_time['td'] = df_time['stop'].values - df_time['start'].values
print(df_time)
                start                stop               td
0 2015-11-04 10:12:00 2015-11-06 06:38:00  1 days 20:26:00
1 2015-11-04 10:23:00 2015-11-05 08:30:00  0 days 22:07:00
2 2015-11-04 14:01:00 2015-11-17 10:34:00 12 days 20:33:00
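
To get closer to the "2D,?H" layout asked for in the question, the timedelta column can be split into its parts with the .dt.components accessor. This is only a sketch: the column name td_label and the exact label format are assumptions, and the sample rows are copied from the output above.

```python
import pandas as pd

# Sample rows copied from the output above.
df_time = pd.DataFrame({
    "start": pd.to_datetime([
        "2015-11-04 10:12:00",
        "2015-11-04 10:23:00",
        "2015-11-04 14:01:00",
    ]),
    "stop": pd.to_datetime([
        "2015-11-06 06:38:00",
        "2015-11-05 08:30:00",
        "2015-11-17 10:34:00",
    ]),
})

# Subtracting two datetime columns yields a timedelta column.
td = df_time["stop"] - df_time["start"]

# .dt.components splits each timedelta into days, hours, minutes, ...
parts = td.dt.components

# Hypothetical label format "<days>D,<hours>H"; minutes are dropped, not rounded.
df_time["td_label"] = (
    parts["days"].astype(str) + "D," + parts["hours"].astype(str) + "H"
)

print(df_time["td_label"].tolist())
# ['1D,20H', '0D,22H', '12D,20H']
```

Note that the hours are truncated here, so 1 days 20:26:00 becomes 1D,20H.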
jezrael

First, make sure that your columns actually contain datetimes:

# Assign the converted columns back directly so they are replaced as a whole
data['start'] = pd.to_datetime(data['start'])
data['stop'] = pd.to_datetime(data['stop'])

Then subtract:

data['delta'] = data['stop'] - data['start']
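
Put together, the two steps might look like the sketch below. It assumes the columns start out as strings; the errors="coerce" argument and the delta_hours column are additions for illustration, not part of the steps above.

```python
import pandas as pd

# Assumed starting point: both columns hold strings, not datetimes.
data = pd.DataFrame({
    "start": ["2015-11-04 10:12:00", "2015-11-19 01:43:00"],
    "stop": ["2015-11-06 06:38:00", "2015-12-21 09:04:00"],
})

# Convert each column; errors="coerce" turns unparseable values into NaT
# instead of raising.
for col in ("start", "stop"):
    data[col] = pd.to_datetime(data[col], errors="coerce")

# With both columns as datetimes, plain subtraction gives timedeltas.
data["delta"] = data["stop"] - data["start"]

# Optional: express the difference as a float number of hours.
data["delta_hours"] = data["delta"].dt.total_seconds() / 3600

print(data)
```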
Treizh
I guess it's just a typo, but just to be sure: sometimes you write time_df and other times df_time – Treizh Jul 17 '18 at 12:36