1

I have a Pandas DataFrame that was generated based on a dictionary (that had some dictionaries inside it).

When I print the dictionary it gives me something like this illustration below:

Current Dataframe Table enter image description here

I need to, somehow, to transform those dictionaries into columns that are going to be children of the month (Jan, Feb, etc.) column. Like this:

Dataframe Table that I need enter image description here

**UPDATE - Adding the dictionary **

data={'2007': {'Jan': {'working_days': 23,'weekend': 4,'holydays': 4,'days': 31},'Feb': {'working_days': 20,'weekend': 6,'holydays': 2,'days': 28},'Mar': {'working_days': 20,'weekend': 6,'holydays': 2,'days': 28}},'2008': {'Jan': {'working_days': 23,'weekend': 4,'holydays': 4,'days': 31},'Feb': {'working_days': 20,'weekend': 6,'holydays': 2,'days': 28},'Mar': {'working_days': 20,'weekend': 6,'holydays': 2,'days': 28}},'2009': {'Jan': {'working_days': 23,'weekend': 4,'holydays': 4,'days': 31},'Feb': {'working_days': 20,'weekend': 6,'holydays': 2,'days': 28},'Mar': {'working_days': 20,'weekend': 6,'holydays': 2,'days': 28}}}

How can I do it?

Thanks in advance!

  • Why don't you concatenate the column names (e.g. `Jan_working_days`, `Jan_weekend`, ..., `Fev_working_days`) instead of having a double header? – GRoutar Jul 04 '19 at 14:41
  • Because my boss do not want me to do that. LoL But thanks for the suggestion. – Vinicius Correia Jul 04 '19 at 14:52
  • 1
    It looks like you want to build a MultiIndex for the columns. Certainly possible but I shall not post any answer without testing it and I cannot test if you do not provide a [mcve]. Maybe [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) could help to understand what I mean and why I need it. – Serge Ballesta Jul 04 '19 at 14:55
  • Here, I think that I need (an extract of) the initial dictionary. – Serge Ballesta Jul 04 '19 at 15:04
  • Added the data dict. Sorry for the delay guys. – Vinicius Correia Jul 05 '19 at 10:08

1 Answers1

1

If you start from your dataframe you can create a new one with the desired structure as follows:

month_labels=['Jan', 'Fev']
attributes=['working_days', 'weekend']
new_cols= pd.MultiIndex.from_arrays([month_labels, attributes]).sort_values()
new_cols= pd.MultiIndex.from_product([month_labels, attributes], names=['Month', 'Attribute'])
#df_new=pd.DataFrame(columns=pd.MultiIndex.from_arrays([month_labels, attributes]), index=df.index)
df_new=pd.DataFrame(columns=new_cols, index=df.index)
for month in df.columns:
    for attribute in attributes:
        df_new[(month, attribute)]= df[month].map(lambda d: d[attribute])

df_new

You just have to complete the lists of month_labels and attributes. The result looks as follows (with invented data --> my months usually have less than 32 days, too :-):

Month              Jan                  Fev        
Attribute working_days weekend working_days weekend
Year                                               
2007                19       9           19       8
2008                22      10           20       8
2009                19      10           18       8
2010                20      10           18       8
jottbe
  • 4,228
  • 1
  • 15
  • 31