3

Hi I have a dictionary of dictionaries as shown (copied example). I would like to convert it into a tall DataFrame.

Example dictionary:

{'Jill': {'Avenger: Age of Ultron': 7.0,
         'Django Unchained': 6.5,
         'Gone Girl': 9.0,
         'Kill the Messenger': 8.0},
 'Toby': {'Avenger: Age of Ultron': 8.5,
          'Django Unchained': 9.0,
          'Zoolander': 2.0}}

Expected tall dataframe look like below:

   Column1        Column2              Column3
    Jill     Avenger: Age of Ultron      7.0
    Jill     Django Unchained            6.5
    Jill     Gone Girl                   9.0
    Jill     Kill the Messenger          8.0
    Toby     Avenger: Age of Ultron      8.5
    Toby     Django Unchained            9.0
    Toby     Zoolander                   2.0

I tried the below, it gives me main keys of dictionaries as first column, but couldn't make it as tall dataframe:

pd.DataFrame.from_dict(d, orient='index')
Vamsi Nimmala
  • 497
  • 1
  • 7
  • 19
  • You could use a simple dictionary comprehension to parse this starting with a concat statement `pd.concat({k: pd.DataFrame(v).T for k, v in yourDict.items()}, axis=0)` – Umar.H Mar 13 '19 at 22:22

5 Answers5

2
d = {'Jill': {'Avenger: Age of Ultron': 7.0,
                            'Django Unchained': 6.5,
                            'Gone Girl': 9.0,
                            'Kill the Messenger': 8.0},
'Toby': {'Avenger: Age of Ultron': 8.5,
                                'Django Unchained': 9.0,
                                'Zoolander': 2.0}}

df = pd.DataFrame.from_dict(d).reset_index()
df = pd.melt(df, id_vars=["index"], 
                  var_name="By", value_name="Score").dropna()
mujjiga
  • 16,186
  • 2
  • 33
  • 51
0

A rather simple way would be to convert the dictionnary to a list of lists and feed the dataframe with that:

data = [[x, y, z] for x,v in d.items() for y, z in v.items()]
df = pd.DataFrame(data, columns=['Column' + str(i) for i in range(1,4)])

It gives:

  Column1                 Column2  Column3
0    Jill  Avenger: Age of Ultron      7.0
1    Jill        Django Unchained      6.5
2    Jill               Gone Girl      9.0
3    Jill      Kill the Messenger      8.0
4    Toby  Avenger: Age of Ultron      8.5
5    Toby        Django Unchained      9.0
6    Toby               Zoolander      2.0
Serge Ballesta
  • 143,923
  • 11
  • 122
  • 252
0
ratings = {'Jill': {'Avenger: Age of Ultron': 7.0,
                    'Django Unchained': 6.5,
                    'Gone Girl': 9.0,
                    'Kill the Messenger': 8.0},
           'Toby': {'Avenger: Age of Ultron': 8.5,
                    'Django Unchained': 9.0,
                    'Zoolander': 2.0}}
values = [[name, movie, rating] for name, r in ratings.items() for movie, rating in r.items()] 
df = pd.dataframe(values)
BallpointBen
  • 9,406
  • 1
  • 32
  • 62
0

Just pass to the DataFrame , all the rest we handle in pandas

pd.DataFrame(ratings).reset_index().melt('index').dropna()
Out[118]: 
                    index variable  value
0  Avenger: Age of Ultron     Jill    7.0
1        Django Unchained     Jill    6.5
2               Gone Girl     Jill    9.0
3      Kill the Messenger     Jill    8.0
5  Avenger: Age of Ultron     Toby    8.5
6        Django Unchained     Toby    9.0
9               Zoolander     Toby    2.0
BENY
  • 317,841
  • 20
  • 164
  • 234
0

Another approach, using the pandas.DataFrame constructor on the dict as is, then stacking, renaming and sorting the axis and columns:

df = (pd.DataFrame(d)
      .stack()
      .reset_index()
      .rename({'level_0': 'Column2', 'level_1': 'Column1', 0: 'Column3'}, axis=1)
      .sort_index(1).sort_values('Column1'))

print(df)


  Column1                 Column2  Column3
0    Jill  Avenger: Age of Ultron      7.0
2    Jill        Django Unchained      6.5
4    Jill               Gone Girl      9.0
5    Jill      Kill the Messenger      8.0
1    Toby  Avenger: Age of Ultron      8.5
3    Toby        Django Unchained      9.0
6    Toby               Zoolander      2.0
Chris Adams
  • 18,389
  • 4
  • 22
  • 39