I would like to store a summary of a local set of DataFrames in a "meta DataFrame" using pd.MultiIndex.
Both the row axis and the column axis have two levels. In the class managing the set of DataFrames, I define this "meta DataFrame" as a class variable.
import pandas as pd
row_axis = pd.MultiIndex(levels=[[],[]], codes=[[],[]], names=['Data', 'Period'])
column_axis = pd.MultiIndex(levels=[[],[]], codes=[[],[]], names=['Data', 'Extrema'])
MD = pd.DataFrame(index=row_axis, columns=column_axis)
It seems to work.
MD.index
>>> MultiIndex([], names=['Data', 'Period'])
MD.columns
>>> MultiIndex([], names=['Data', 'Extrema'])
Now, each time I process an individual DataFrame id, I want to update this "meta DataFrame" accordingly. id has a DatetimeIndex with a period of '5m'.
id.index[0]
>>> Timestamp('2020-01-01 08:00:00')
id.index[-1]
>>> Timestamp('2020-01-02 08:00:00')
For instance, I want to keep its first and last index values in MD.
MD.loc[[('id', '5m')],[('Timestamp', 'First')]] = id.index[0]
MD.loc[[('id', '5m')],[('Timestamp', 'Last')]] = id.index[-1]
This doesn't work; I get the following error message:
TypeError: unhashable type: 'list'
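For comparison, here is a minimal sketch of the same assignment with each key passed as a plain tuple instead of a list containing a tuple. It relies on .loc adding a missing row and column on assignment; I am not certain this is the recommended approach. The two Timestamp values stand in for id.index[0] and id.index[-1].

```python
import pandas as pd

# Same empty two-level axes as above.
row_axis = pd.MultiIndex(levels=[[], []], codes=[[], []], names=['Data', 'Period'])
column_axis = pd.MultiIndex(levels=[[], []], codes=[[], []], names=['Data', 'Extrema'])
MD = pd.DataFrame(index=row_axis, columns=column_axis)

# Stand-ins for id.index[0] and id.index[-1].
first = pd.Timestamp('2020-01-01 08:00:00')
last = pd.Timestamp('2020-01-02 08:00:00')

# One tuple per axis: (row key, column key), with no surrounding lists.
MD.loc[('id', '5m'), ('Timestamp', 'First')] = first
MD.loc[('id', '5m'), ('Timestamp', 'Last')] = last

print(MD)
```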
In the end, I would like MD to hold the following kind of information (I have other id DataFrames with different periods):
         Timestamp
         First                Last
id  5m   2020-01-01 08:00:00  2020-01-02 08:00:00
    10m  2020-01-05 08:00:00  2020-01-06 18:00:00
Ultimately, I will also keep the min and max of some columns in id. For instance, if id has a column 'Temperature':
         Timestamp                                 Temperature
         First                Last                 Min   Max
id  5m   2020-01-01 08:00:00  2020-01-02 08:00:00  -2.5  10
    10m  2020-01-05 08:00:00  2020-01-06 18:00:00  4     15
These values will be recorded when I record id.
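To make the recording step concrete, here is a rough sketch of how the four values could be gathered from one DataFrame. The helper summarize, the stand-in df_5m and its Temperature values are all made up for illustration; only the labels come from the layout above. The values are collected in a dict keyed by tuples and then turned into a DataFrame, which is only one possible way to do it.

```python
import pandas as pd

def summarize(df, column):
    """Hypothetical helper: map (Data, Extrema) tuples to the values to record."""
    return {
        ('Timestamp', 'First'): df.index[0],
        ('Timestamp', 'Last'): df.index[-1],
        (column, 'Min'): df[column].min(),
        (column, 'Max'): df[column].max(),
    }

# Made-up stand-in for id: a 5-minute DatetimeIndex with dummy temperatures.
index = pd.date_range('2020-01-01 08:00:00', '2020-01-02 08:00:00', freq='5min')
df_5m = pd.DataFrame({'Temperature': range(len(index))}, index=index)

# Outer keys are (Data, Period) row tuples, inner keys are column tuples.
summary = {('id', '5m'): summarize(df_5m, 'Temperature')}

MD = pd.DataFrame.from_dict(summary, orient='index')
MD.index.names = ['Data', 'Period']
MD.columns.names = ['Data', 'Extrema']

print(MD)
```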
I am aware that initializing a DataFrame cell by cell is not time-efficient, but it will not be done that often.
Besides, I don't see how I can manage this organization of information in a dict, which is why I am considering a multi-level DataFrame. I will then dump it to a CSV file to store these "meta data".
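For the CSV step, this is the round trip I have in mind, sketched with an in-memory buffer instead of a real file. The frame is filled with the example values from the table above. I am assuming that header=[0, 1] and index_col=[0, 1] are enough for read_csv to rebuild both two-level axes; the timestamps come back as plain strings unless they are parsed explicitly.

```python
import io
import pandas as pd

# Example values from the table above: column tuple -> {row tuple: value}.
MD = pd.DataFrame({
    ('Timestamp', 'First'): {
        ('id', '5m'): pd.Timestamp('2020-01-01 08:00:00'),
        ('id', '10m'): pd.Timestamp('2020-01-05 08:00:00'),
    },
    ('Timestamp', 'Last'): {
        ('id', '5m'): pd.Timestamp('2020-01-02 08:00:00'),
        ('id', '10m'): pd.Timestamp('2020-01-06 18:00:00'),
    },
})
MD.index.names = ['Data', 'Period']
MD.columns.names = ['Data', 'Extrema']

# Write to an in-memory buffer; a file path would be used the same way.
buffer = io.StringIO()
MD.to_csv(buffer)
buffer.seek(0)

# Two header rows and two index columns describe the two-level axes.
restored = pd.read_csv(buffer, header=[0, 1], index_col=[0, 1])

print(restored)
```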
What is the right way to set each of these values in MD?
Thank you for your help!