I have columns like this in a csv file (I load it using read_csv('fileA.csv', parse_dates=['ProcessA_Timestamp'])
)
Item ProcessA_Timestamp
'A' 2014-06-08 03:32:20
'B' 2014-06-08 03:32:20
'A' 2014-06-08 03:33:19
'C' 2014-06-08 03:33:20
'B' 2014-06-08 03:33:40
'D' 2014-06-08 03:38:20
How would I go about creating a column called ProcessA_ProcessingTime
, which would be the time difference between last time an item occurs in the table -
first time it occurs in the table.
Similarly, I have other data frames (which I'm not sure if they should be merged into one dataframe).. that have their own Process*_Timestamp
s.
Finally, I need to create a table, where the data is like this:
Item ProcessA_ProcessingTime ProcessB_ProcessingTime ... ProcessX_ProcessingTime
'A' 00:00:59 ...
'B' 00:01:21
'C' NOT FINISHED YET
'D' NOT FINISHED YET