I am working on a project and would like to learn pandas, since it seems well suited to what I am trying to do.
CONSTRAINTS: I cannot change the fact that other Python code reads in multiple files to construct the dictionary. That is how it has to be, so I must create the DataFrame from that dictionary of lists.
Problem 1: I am being passed a dictionary. It contains multiple items with numeric keys (for example '400'), and each item holds a list of data. That data includes a time field, and I want to compute the difference in time between consecutive rows. Right now I read in a single item, transpose it to get it into the correct structure, and then try to call .diff() on it. For example:
'400' : [ [ {'IDs'} ], {'TIME COL'}, ... ]
I then run the following command:
df = pd.DataFrame(myData['400'])
I then transpose the result so that all my times are in rows rather than in columns.
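To make this concrete, here is a minimal sketch of what I am doing. The data is made up: the IDs, the time values, and the column names are placeholders, and I am assuming the values arrive as strings. Only the overall shape mirrors what I actually receive.

```python
import pandas as pd

# Hypothetical stand-in for the dictionary I am handed (values are invented).
myData = {
    '400': [
        ['a', 'b', 'c'],      # IDs (placeholder values)
        ['10', '25', '45'],   # time column (placeholder values, as strings)
    ]
}

df = pd.DataFrame(myData['400'])  # each inner list becomes one row
df = df.T                         # transpose: each inner list becomes one column
df.columns = ['IDs', 'TIME']      # hypothetical column names, for readability
print(df)
```

After the transpose, each time value sits in its own row, which is the layout I believe diff expects.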
Question 1: Is this the correct way of doing it? I want the difference between consecutive times, and from what I have read, diff works down the rows and not across the columns. If this is not correct, how should I be doing it?
Problem 2: For now the transpose works. However, when I call diff on any column, I get the following error:
TypeError: unsupported operand type(s) for -: 'str' and 'str'
Question 2: Am I reading the data in incorrectly? Should I be converting it somehow so that pandas recognizes it as an integer dataset and I can perform this operation?
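One thing I am considering, though I am not sure it is the right approach, is converting the column with pd.to_numeric before calling diff. The sketch below uses made-up time values and a hypothetical column name, and assumes the times are plain numbers stored as strings; if they are really timestamps, something like pd.to_datetime would presumably be needed instead.

```python
import pandas as pd

# Made-up string times standing in for my real column.
df = pd.DataFrame({'TIME': ['10', '25', '45']})

# Convert strings to numbers; errors='coerce' turns unparseable values into NaN.
df['TIME'] = pd.to_numeric(df['TIME'], errors='coerce')

# diff() subtracts the previous row from each row.
# The first row has no predecessor, so its result is NaN.
df['DELTA'] = df['TIME'].diff()
print(df)
```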
Any insight is welcome!