6

I have the following dictionary.

import pandas as pd

d = {'key1': {'sub-key1': ['a','b','c','d','e']},
     'key2': {'sub-key2': ['1','2','3','5','8','9','10']}}

With the help of this post, I managed to successfully convert this dictionary to a DataFrame.

df = pd.DataFrame.from_dict({(i,j): d[i][j] 
                            for i in d.keys() 
                            for j in d[i].keys()},
                            orient='index')

However, my DataFrame takes the following form:

                  0  1  2  3  4     5     6
(key1, sub-key1)  a  b  c  d  e  None  None
(key2, sub-key2)  1  2  3  5  8     9    10

I can work with tuples as index values, but I think a DataFrame with a MultiIndex would be better. Posts such as this one have helped me create it in two steps, but I am struggling to do it in one step (i.e. at initial creation), because the lists inside the dictionary and the resulting tuples add a level of complication.

Newskooler
  • So you already have a working solution and would like to improve your code? Please post your working solution, and use https://codereview.stackexchange.com/ – WNG Nov 21 '17 at 14:58
  • Use `df.index = pd.MultiIndex.from_tuples(df.index)` on what you've created already? – Zero Nov 21 '17 at 15:16
  • @Zero it's been a long time since I've seen you. Where have you been? – Bharath M Shetty Nov 21 '17 at 15:21

2 Answers

18

I think you are close. To get a MultiIndex, you can use the MultiIndex.from_tuples method:

# Flatten into a new dict (flat) so the original d is not overwritten.
flat = {(i,j): d[i][j] 
        for i in d.keys() 
        for j in d[i].keys()}

mux = pd.MultiIndex.from_tuples(flat.keys())
df = pd.DataFrame(list(flat.values()), index=mux)
print(df)
               0  1  2  3  4     5     6
key1 sub-key1  a  b  c  d  e  None  None
key2 sub-key2  1  2  3  5  8     9    10
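Since the question asks for a single step, the same idea can be wrapped in a small helper. This is only a sketch: the function name `nested_dict_to_frame` and the level names `key` and `sub_key` are made up for illustration and appear nowhere in the original post.

```python
import pandas as pd


def nested_dict_to_frame(nested, names=("key", "sub_key")):
    """Build a MultiIndex DataFrame from a dict of dicts of lists.

    Hypothetical helper; the level names are illustrative assumptions.
    """
    # One (outer, inner) tuple per list, in insertion order.
    keys = [(outer, inner) for outer, sub in nested.items() for inner in sub]
    rows = [nested[outer][inner] for outer, inner in keys]
    index = pd.MultiIndex.from_tuples(keys, names=list(names))
    # Lists shorter than the longest one are padded with missing values.
    return pd.DataFrame(rows, index=index)


d = {'key1': {'sub-key1': ['a', 'b', 'c', 'd', 'e']},
     'key2': {'sub-key2': ['1', '2', '3', '5', '8', '9', '10']}}

df = nested_dict_to_frame(d)

# With a MultiIndex, all rows under one outer key can be selected directly.
key2_rows = df.loc["key2"]
```

Naming the levels is optional, but it makes later calls such as `df.xs("sub-key2", level="sub_key")` easier to read.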

Thanks to Zero for another solution:

df = pd.DataFrame.from_dict({(i,j): d[i][j] 
                            for i in d.keys() 
                            for j in d[i].keys()},
                            orient='index')

df.index = pd.MultiIndex.from_tuples(df.index)
print(df)
               0  1  2  3  4     5     6
key1 sub-key1  a  b  c  d  e  None  None
key2 sub-key2  1  2  3  5  8     9    10
jezrael
3

I would use stack for a two-level dict:

df=pd.DataFrame(d)

df.T.stack().apply(pd.Series)
Out[230]: 
               0  1  2  3  4    5    6
key1 sub-key1  a  b  c  d  e  NaN  NaN
key2 sub-key2  1  2  3  5  8    9   10
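To see what the chained call does, it can be split into its intermediate steps. The sketch below is an illustration rather than the answerer's exact code: the variable names are invented, and the explicit `dropna()` is an addition so that the result does not depend on whether `stack()` itself discards the missing (key, sub-key) combinations.

```python
import pandas as pd

d = {'key1': {'sub-key1': ['a', 'b', 'c', 'd', 'e']},
     'key2': {'sub-key2': ['1', '2', '3', '5', '8', '9', '10']}}

# Outer keys become columns, inner keys become the index; each cell holds
# a whole list, and combinations that do not exist in d are NaN.
wide = pd.DataFrame(d)

# Transpose and stack: a Series of lists indexed by (outer, inner) pairs.
# dropna() removes the combinations that were never in the dict.
stacked = wide.T.stack().dropna()

# Expand each list into its own columns.
expanded = stacked.apply(pd.Series)

# Alternative expansion that avoids building one Series per row.
expanded_alt = pd.DataFrame(stacked.tolist(), index=stacked.index)
```

Both expansions keep the MultiIndex produced by `stack()`; they differ only in how the lists are spread across columns.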
BENY
  • @Bharath also, I'm just curious: why do we need to modify the dict? Personally, I do not like reconstructing the dict. – BENY Nov 21 '17 at 15:55
  • Maybe because it's the MultiIndex way of doing it, and it would be much faster than transpose and apply. – Bharath M Shetty Nov 21 '17 at 15:57