Example dictionary:

sample_dict = {'doctor': {'docter_a': 26, 'docter_b': 40, 'docter_c': 42}, 
               'teacher': {'teacher_x': 21, 'teacher_y': 45, 'teacher_z': 33}}

Desired output dataframe:

job     | person    | age
doctor  | doctor_a  | 26
doctor  | doctor_b  | 40
doctor  | doctor_c  | 42
teacher | teacher_x | 21
teacher | teacher_y | 45
teacher | teacher_z | 33

I have tried:

df = pd.dataFrame.from_dict(sample_dict)

=>

             doctor      teacher
doctor_a  |  26      |   Nah
doctor_b  |  40      |   Nah
doctor_c  |  42      |   Nah
teacher_x |  Nah     |   21
teacher_y |  Nah     |   45
teacher_z |  Nah     |   33

Could someone help me figure this out?

wjandrea
  • Is `dataFrame` a typo? It should be `DataFrame`, no? Same for `Nah` vs `NaN`. – wjandrea Nov 16 '22 at 19:37
  • `docter` is a typo, and it doesn't match between input and output – wjandrea Nov 16 '22 at 19:37
  • Related: [Nested dictionary to multiindex dataframe where dictionary keys are column labels](/q/24988131/4518341). If you use the solution in the accepted answer and transpose after, you get pretty much what you want, just need to reset index and rename. – wjandrea Nov 16 '22 at 19:48
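The multi-index route suggested in the last comment could be sketched roughly as follows. This is not the linked answer's exact code: it assumes `pd.concat` on a dict of Series to build the two-level index, then resets and renames it as the comment describes. The key spellings are corrected to `doctor_*`, and the names `stacked` and `df` are only illustrative.

```python
import pandas as pd

sample_dict = {'doctor': {'doctor_a': 26, 'doctor_b': 40, 'doctor_c': 42},
               'teacher': {'teacher_x': 21, 'teacher_y': 45, 'teacher_z': 33}}

# Each inner dict becomes a Series indexed by person;
# the outer keys become the first level of a MultiIndex.
stacked = pd.concat({job: pd.Series(people)
                     for job, people in sample_dict.items()})

# Name the two index levels, then move them into ordinary columns.
df = stacked.rename_axis(['job', 'person']).reset_index(name='age')
print(df)
```

Because no wide intermediate frame is built, no missing values are introduced and the ages should stay integers.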

2 Answers

4

Use a nested list comprehension:

import pandas as pd

# One [job, person, age] row per entry of each inner dict
pd.DataFrame([[k1, k2, v]
              for k1, d in sample_dict.items()
              for k2, v in d.items()],
             columns=['job', 'person', 'age'])

Output:

       job     person  age
0   doctor   docter_a   26
1   doctor   docter_b   40
2   doctor   docter_c   42
3  teacher  teacher_x   21
4  teacher  teacher_y   45
5  teacher  teacher_z   33
mozway
1

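You can zip each job's entries into 3-element tuples (job, person, age), then flatten and reshape them before passing them to pd.DataFrame:

import numpy as np
import pandas as pd

zip_list = [list(zip([key] * len(sample_dict[key]),
                     sample_dict[key],
                     sample_dict[key].values()))
            for key in sample_dict]

# Each row has 3 fields; -1 lets NumPy infer the number of rows.
# This assumes every inner dict has the same length, and NumPy stores the
# mixed values as strings, so 'age' may need converting back to int.
output = pd.DataFrame(np.ravel(zip_list).reshape(-1, 3),
                      columns=['job', 'person', 'age'])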
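As a hedged aside, the same zip idea can be written without NumPy. NumPy coerces mixed string and integer values to a single string dtype, so skipping it keeps the ages as integers, and this version does not rely on every job having the same number of people. The sketch below assumes `itertools.chain.from_iterable` for the flattening step; the names `rows` and `df` are illustrative, and the key spellings are corrected to `doctor_*`.

```python
from itertools import chain

import pandas as pd

sample_dict = {'doctor': {'doctor_a': 26, 'doctor_b': 40, 'doctor_c': 42},
               'teacher': {'teacher_x': 21, 'teacher_y': 45, 'teacher_z': 33}}

# For each job, zip the repeated job name with the inner keys and values,
# then chain the per-job groups into one flat stream of 3-tuples.
rows = chain.from_iterable(
    zip([job] * len(people), people, people.values())
    for job, people in sample_dict.items()
)

df = pd.DataFrame(list(rows), columns=['job', 'person', 'age'])
print(df)
```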
Nuri Taş