8

My dictionary looks like this:

{'x': {'b': 10, 'c': 20}, 'y': {'b': '33', 'c': 44}}

I want to get a dataframe that looks like this:

index   col1    col2    val
0        x       b      10
1        x       c      20
2        y       b      33
3        y       c      44

I tried calling pandas.from_dict(), but it did not give me the desired result. So, what is the most elegant, practical way to achieve this?

EDIT: In reality, my dictionary is of depth 4, so I'd like to see a solution for that case, or ideally, one that would work for arbitrary depth in a general setup.

Here is an example of a deeper dictionary: {'x':{'a':{'m':1, 'n':2}, 'b':{'m':10, 'n':20}}, 'y':{'a':{'m':100, 'n':200}, 'b':{'m':111, 'n':222}} } The appropriate dataframe should have 8 rows.

ANSWER:

df = pd.DataFrame([(k1, k2, k3, k4, k5, v) for k1, k2345v in dict.items()
                           for k2, k345v in k2345v.items()
                           for k3, k45v in k345v.items()
                           for k4, k5v in k45v.items()
                           for k5, v in k5v.items()])
Baron Yugovich
  • 3,843
  • 12
  • 48
  • 76

3 Answers3

6

For any depth, you could use pd.json_normalize and melt. Below is an example with a slightly modified 2/3/4-deep dictionary

data = {'one': 1, 'two': {'a': 2}, 'four': {'a': {'b': {'c': 2}}},
        'x': {'a': {'m': 1, 'n': 2}, 'b': {'m': 10, 'n': 20}},
        'y': {'a': {'m': 100, 'n': 200}, 'b': {'m': 111, 'n': 222}}}
df_melt = pd.json_normalize(data, sep='>>').melt()
df_final = df_melt['variable'].str.split('>>', expand=True)
df_final.columns = [f'col{name}' for name in df_final.columns]
df_final[['value']] = df_melt['value']
    col0  col1  col2  col3  value
0    one  None  None  None      1
1    two     a  None  None      2
2   four     a     b     c      2
3      x     a     m  None      1
4      x     a     n  None      2
5      x     b     m  None     10
6      x     b     n  None     20
7      y     a     m  None    100
8      y     a     n  None    200
9      y     b     m  None    111
10     y     b     n  None    222

json_normalize is really useful and there are some additional examples on Medium

KyleKing
  • 129
  • 4
  • 11
5

You can use a list comprehension to reorder your dict into a list of tuples where each tuple is a row and then you can sort your dataframe

import pandas as pd

d = {'x': {'b': 10, 'c': 20}, 'y': {'b': '33', 'c': 44}}

df = pd.DataFrame([(k,k1,v1) for k,v in d.items() for k1,v1 in v.items()], columns = ['Col1','Col2','Val'])
print df.sort(['Col1','Col2','Val'], ascending=[1,1,1])

  Col1 Col2 Val
3    x    b  10
2    x    c  20
1    y    b  33
0    y    c  44
SirParselot
  • 2,640
  • 2
  • 20
  • 31
3

first create the df using from_dict, then call stack and reset_index to get the shape you desire, you then need to rename the cols, sort and reset the index:

In [83]:
d={'x': {'b': 10, 'c': 20}, 'y': {'b': '33', 'c': 44}}
df = pd.DataFrame.from_dict(d, orient='index').stack().reset_index()
df.columns = ['col1', 'col2', 'val']
df.sort_values(['col1', 'col2'], inplace=True)
df.reset_index(drop=True, inplace=True)
df

Out[83]:
  col1 col2 val
0    x    b  10
1    x    c  20
2    y    b  33
3    y    c  44
EdChum
  • 376,765
  • 198
  • 813
  • 562
  • 1
    Is this going to work for more deeply nested dictionaries? – Baron Yugovich Nov 09 '15 at 15:02
  • Depends but I would say no as the ctor makes presumptions about the structure so you may need to unfold the dict first – EdChum Nov 09 '15 at 15:03
  • 1
    Can you please edit your response then to account for this? My dictionary has depth 4, i.e. (key1, key2, key3, key4, val), you get the idea. – Baron Yugovich Nov 09 '15 at 15:07
  • Sorry but adjusting your requirements to your *real* dataset and issue is frowned upon on SO. You should've posted that as your real question or you should post a new question as it wastes people's time answering questions that don't represent your real problem as you're essentially incrementing your requirements – EdChum Nov 09 '15 at 15:09
  • I apologize, I didn't emphasize in the original question that the dictionary can be of greater depth. The thing is, it is harder to type out an appropriate example, so I simplified it. I will edit my question here. – Baron Yugovich Nov 09 '15 at 15:11