How to build a MultiIndex Pandas DataFrame from a nested dictionary with lists

Question

I have the following dictionary.

d= {'key1': {'sub-key1': ['a','b','c','d','e']},
    'key2': {'sub-key2': ['1','2','3','5','8','9','10']}}

With the help of this post, I managed to successfully convert this dictionary to a DataFrame.

df = pd.DataFrame.from_dict({(i,j): d[i][j] 
                            for i in d.keys() 
                            for j in d[i].keys()},
                            orient='index')

However, my DataFrame takes the following form:

                  0  1  2  3  4     5     6
(key1, sub-key1)  a  b  c  d  e  None  None
(key2, sub-key2)  1  2  3  5  8     9    10

I can work with tuples as index values, but I think it would be better to work with a MultiIndex DataFrame. Posts such as this one have helped me create it in two steps, but I am struggling to do it in one step (i.e. at initial creation), as the lists within the dictionary and the resulting tuples add a level of complication.

So you already have a working solution and would like to improve your code ? Please post your working solution, and use https://codereview.stackexchange.com/ — WNG, Nov 21 '17 at 14:58
Use `df.index = pd.MultiIndex.from_tuples(df.index)` on what you've created already? — Zero, Nov 21 '17 at 15:16
@Zero its been a long time seeing you. Where have you been ? — Bharath M Shetty, Nov 21 '17 at 15:21

jezrael · Accepted Answer · 2017-11-21T15:22:10.733

18

I think you are close. To get a MultiIndex, you can use the MultiIndex.from_tuples method:

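import pandas as pd

# Flatten the nested dict so each key is an (outer, inner) tuple
d = {(i, j): d[i][j]
     for i in d.keys()
     for j in d[i].keys()}

# Build the MultiIndex from the tuple keys and pass it as the index
mux = pd.MultiIndex.from_tuples(d.keys())
df = pd.DataFrame(list(d.values()), index=mux)
print(df)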
               0  1  2  3  4     5     6
key1 sub-key1  a  b  c  d  e  None  None
key2 sub-key2  1  2  3  5  8     9    10

Thanks to Zero for another solution:

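# Here d is the original nested dict, not the flattened one from above
df = pd.DataFrame.from_dict({(i, j): d[i][j]
                             for i in d.keys()
                             for j in d[i].keys()},
                            orient='index')

# Convert the index of tuples into a MultiIndex
df.index = pd.MultiIndex.from_tuples(df.index)
print(df)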
               0  1  2  3  4     5     6
key1 sub-key1  a  b  c  d  e  None  None
key2 sub-key2  1  2  3  5  8     9    10
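
For reference, here is a self-contained sketch of the first approach. The variable names `nested` and `flat` and the level names `key` and `sub_key` are illustrative assumptions, not part of the original answer. Keeping the flattened dict under a separate name leaves the original dictionary intact, and passing `names=` to `MultiIndex.from_tuples` is optional.

```python
import pandas as pd

# Same data as in the question
nested = {
    'key1': {'sub-key1': ['a', 'b', 'c', 'd', 'e']},
    'key2': {'sub-key2': ['1', '2', '3', '5', '8', '9', '10']},
}

# Flatten to {(outer, inner): list} without overwriting `nested`
flat = {
    (outer, inner): values
    for outer, inner_dict in nested.items()
    for inner, values in inner_dict.items()
}

# list(flat) yields the tuple keys; the level names are hypothetical labels
mux = pd.MultiIndex.from_tuples(list(flat), names=['key', 'sub_key'])

# Rows of unequal length are padded with missing values
df = pd.DataFrame(list(flat.values()), index=mux)

# Select every row under the outer key 'key1'
print(df.loc['key1'])
```

With a real MultiIndex in place, selecting by the outer level (as in the last line) works directly, which is not the case with an index of plain tuples.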

edited Nov 21 '17 at 15:22

answered Nov 21 '17 at 15:03

Sir I see @Zero's comment now. You can add his name if you update the answer – Bharath M Shetty Nov 21 '17 at 15:22
One improvement. Just `mux = pd.MultiIndex.from_tuples(d)`. Similar to how `for k in dictionary...` iterates over its keys not key/value pairs. – Brad Solomon Nov 21 '17 at 15:26
@Bharath just curious why we do not modify within the DataFrame rather than in the dict – BENY Nov 21 '17 at 15:52

score 3 · Answer 2 · answered Nov 21 '17 at 15:47

I would use stack for a two-level dict:

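df = pd.DataFrame(d)

# Transpose, stack into a Series of lists indexed by (key, sub-key),
# then expand each list into columns
df.T.stack().apply(pd.Series)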
Out[230]: 
               0  1  2  3  4    5    6
key1 sub-key1  a  b  c  d  e  NaN  NaN
key2 sub-key2  1  2  3  5  8    9   10
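
A possible variation on this approach, sketched under the assumption that the input is the two-level dict from the question: instead of `.apply(pd.Series)`, which builds one Series per row, the stacked lists can be passed to the `DataFrame` constructor in a single call. The names `stacked` and `wide` are illustrative. The explicit `.dropna()` is a precaution, since whether `stack()` drops missing entries by default may depend on the pandas version.

```python
import pandas as pd

d = {
    'key1': {'sub-key1': ['a', 'b', 'c', 'd', 'e']},
    'key2': {'sub-key2': ['1', '2', '3', '5', '8', '9', '10']},
}

# Outer keys become rows after the transpose; stack() moves the inner keys
# into a second index level, giving a Series whose values are the lists
stacked = pd.DataFrame(d).T.stack().dropna()

# Expand the lists into columns in one constructor call, reusing the MultiIndex
wide = pd.DataFrame(stacked.tolist(), index=stacked.index)
print(wide)
```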

BENY

@Bharath also, I'm just curious why we need to modify the dict. Personally, I do not like reconstructing the dict. – BENY Nov 21 '17 at 15:55
Maybe because it's the MultiIndex way of doing it, and it would be much faster than transpose and apply – Bharath M Shetty Nov 21 '17 at 15:57
