27

I have a dataframe with a MultiIndex. I am wondering whether I created the data frame in the correct manner (see below).

             01.01  02.01  03.01  04.01
bar total1     40     52     18     11
    total2     36     85      5     92
baz total1     23     39     45     70
    total2     50     49     51     65
foo total1     23     97     17     97
    total2     64     56     94     45
qux total1     13     73     38      4
    total2     80      8     61     50

df.index.values results in:

array([('bar', 'total1'), ('bar', 'total2'), ('baz', 'total1'),
       ('baz', 'total2'), ('foo', 'total1'), ('foo', 'total2'),
       ('qux', 'total1'), ('qux', 'total2')], dtype=object)

df.index.get_level_values results in:

<bound method MultiIndex.get_level_values of MultiIndex(levels=[[u'bar', u'baz', u'foo', u'qux'], [u'total1', u'total2']],
           labels=[[0, 0, 1, 1, 2, 2, 3, 3], [0, 1, 0, 1, 0, 1, 0, 1]],names=[]

I am ultimately looking to transform the df into a dict of dictionaries such that the first dict key are one of ['bar','baz', 'foo','qux'] and values are the dates and the inner dictionary is made of 'total1' and 'totals2' as key and the values are the integers of the df. Alternative explanation, is for example if dict1 is the dict then calling:

dict1['bar']

would result in the output:

{u'bar':{'01.01':{'total1':40,'total2':36},'02.01':{'total1':52,'total2':85},'03.01':{'total1':18,'total2':5},'04.01':{'total1':11,'total2':92} } }

How and what would I need to alter in order to achieve this? Is this an indexing issue?

Jonathan Hall
  • 75,165
  • 16
  • 143
  • 189
OAK
  • 2,994
  • 9
  • 36
  • 49
  • Did you try `df.to_dict()`? – albert Aug 21 '16 at 19:36
  • @albert yes and `df.to_dict()` results in: `{'01.01': {('bar', 'total1'): 40, ('bar', 'total2'): 36, ('baz', 'total1'): 23, ('baz', 'total2'): 50,`etc.. `df.to_dict('index')` results in: `('bar', 'total1'): {'01.08': 40, '02.08': 52, '03.08': 18, '04.08': 11}, ('bar', 'total2'): {'01.08': 36, '02.08': 85, '03.08': 5, '04.08': 92}`. So it gets me near where I want to be, that's why I thought it might be an issue with the formation of the dataframe. – OAK Aug 22 '16 at 06:40
  • You might take a look at the supported options/parameters for `df.to_csv()` since those let you manipulate the way the data is transformed: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_dict.html I'd suggest `df.to_csv('index')` but did not try it. – albert Aug 22 '16 at 07:24

1 Answers1

35

For converting whole dataframe to dictionary Try:

df.groupby(level=0).apply(lambda df: df.xs(df.name).to_dict()).to_dict()

{'bar': {'01.01': {'total1': 40, 'total2': 36},
  '02.01': {'total1': 52, 'total2': 85},
  '03.01': {'total1': 18, 'total2': 5},
  '04.01': {'total1': 11, 'total2': 92}},
 'baz': {'01.01': {'total1': 23, 'total2': 50},
  '02.01': {'total1': 39, 'total2': 49},
  '03.01': {'total1': 45, 'total2': 51},
  '04.01': {'total1': 70, 'total2': 65}},
 'foo': {'01.01': {'total1': 23, 'total2': 64},
  '02.01': {'total1': 97, 'total2': 56},
  '03.01': {'total1': 17, 'total2': 94},
  '04.01': {'total1': 97, 'total2': 45}},
 'qux': {'01.01': {'total1': 13, 'total2': 80},
  '02.01': {'total1': 73, 'total2': 8},
  '03.01': {'total1': 38, 'total2': 61},
  '04.01': {'total1': 4, 'total2': 50}}}

For converting one particular column, select before converting it to dictionary i.e

df.groupby(level=0).apply(lambda df: df.xs(df.name)[colname].to_dict()).to_dict()
Bharath M Shetty
  • 30,075
  • 6
  • 57
  • 108
piRSquared
  • 285,575
  • 57
  • 475
  • 624