
I am working on a project and would like to learn and use pandas, since it seems well suited to what I am trying to do.

CONSTRAINTS: I cannot change the fact that existing Python code reads in multiple files to construct the dictionary. That is how it has to be, so I must create the DataFrame from that dictionary of lists.

Problem 1: I am passed a dictionary. It contains multiple items keyed by number, and each item contains a list of data. The data has a time field, and I am trying to compute the difference in time between consecutive rows. Right now I read in a single item, transpose it to get it into the correct structure, and then try to call .diff() on it. For example:

'400' : [ [ {'IDs'} ], {'TIME COL'}, ... ] 

I then run the following command:

df = pd.DataFrame(myData['400'])

I then transpose the result so that all my times are in rows rather than in columns.

Question 1: Is this the correct way of doing it? I want to take the difference of the times, and I read that diff works on rows and not columns. If this is not correct, how should I be doing it?

Problem 2: For now the transpose works. However, when I try to call diff on any column, I get the following error:

TypeError: unsupported operand type(s) for -: 'str' and 'str'
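
A minimal sketch that raises the same kind of error, assuming the time values are stored as strings. The data below is made up for illustration and is not the real dataset:

```python
import pandas as pd

# Hypothetical time values stored as strings (made-up data, object dtype).
times = pd.Series(["100", "250", "400"], dtype=object)

try:
    # diff() subtracts the previous element from each element,
    # which is not defined for two Python strings.
    times.diff()
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__, error)
```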

Question 2: Am I reading the data in wrong? Should I be converting it somehow so that pandas recognizes it as an integer dataset and I can do this operation?

Any insight is welcome!

Sharki
    It's not very clear what you are asking. Please read [how to make good reproducible pandas examples](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) and edit your post correspondingly. – MaxU - stand with Ukraine Apr 22 '17 at 18:32

1 Answer


This may help you resolve your issue: the error shows that your values are strings, so you need to convert your dates into pandas datetimes before subtracting them:

import pandas as pd

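# Example data: the dates start out as plain strings.
df = pd.DataFrame({'Col1': list('ABC'), 'Col2': ['1/1/2017', '1/2/2017', '1/3/2017'], 'Col3': ['1/1/2016', '1/2/2016', '1/3/2016']})
# Convert both date columns from strings to datetime64.
df[['Col2', 'Col3']] = df[['Col2', 'Col3']].apply(pd.to_datetime)
# Subtracting two datetime columns yields a timedelta column.
df['Col4'] = df['Col2'] - df['Col3']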

This is just an example as you didn't share any actual data.
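
If the time field holds plain numbers rather than dates, the same idea applies with pd.to_numeric. Below is a minimal sketch, assuming one dictionary item is a list of lists with the IDs first and the times second. The my_data contents and the column names id, time and time_diff are hypothetical, since the real structure was not shared:

```python
import pandas as pd

# Hypothetical stand-in for one item of the dictionary (made-up data):
# the first inner list holds IDs, the second holds times as strings.
my_data = {"400": [["a", "b", "c"], ["100", "250", "400"]]}

# Each inner list becomes a row, so transpose to get one column per field.
df = pd.DataFrame(my_data["400"]).T
df.columns = ["id", "time"]

# Convert the strings to numbers; errors="coerce" turns values that
# cannot be parsed into NaN instead of raising.
df["time"] = pd.to_numeric(df["time"], errors="coerce")

# diff() subtracts the previous row from each row (axis=0 by default),
# so the first row has no predecessor and comes out as NaN.
df["time_diff"] = df["time"].diff()

print(df)
```

DataFrame.diff also accepts axis=1 to subtract across columns, but transposing first keeps each field in its own column, which makes the type conversion straightforward.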

zipa