vagrant@ubuntu-xenial:~/lb/f5/v12$ python
Python 2.7.12 (default, Nov 12 2018, 14:36:49)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> data = [{'name': 'bob', 'age': 20}, {'name': 'jim', 'age': 25}, {'name': 'bob', 'age': 30}]
>>> df = pd.DataFrame(data)
>>> df.set_index(keys='name', drop=False, inplace=True)
>>> df
      age name
name
bob    20  bob
jim    25  jim
bob    30  bob
>>> df.to_dict(orient='index')
{'bob': {'age': 30, 'name': 'bob'}, 'jim': {'age': 25, 'name': 'jim'}}
>>>

When the DataFrame is converted to a dictionary, the earlier duplicate entry (bob, age 20) is dropped, because a dictionary can hold only one value per key. Is there a way to produce a dictionary whose values are lists of dictionaries, like this?

{'bob': [{'age': 20, 'name': 'bob'}, {'age': 30, 'name': 'bob'}], 'jim': [{'age': 25, 'name': 'jim'}]}
cs95
Nolan
  • A possible solution can be found here: https://stackoverflow.com/questions/10664856/make-dictionary-with-duplicate-keys-in-python – cors Jan 10 '19 at 22:45

1 Answer


You can do this by grouping on the index.

groupby Comprehension

{k: g.to_dict(orient='records') for k, g in df.groupby(level=0)}
# {'bob': [{'age': 20, 'name': 'bob'}, {'age': 30, 'name': 'bob'}],
#  'jim': [{'age': 25, 'name': 'jim'}]}

Details
groupby allows us to partition the data based on unique keys:

for k, g in df.groupby(level=0):
    print(g, end='\n\n')

      age name
name          
bob    20  bob
bob    30  bob

      age name
name          
jim    25  jim

For each group, convert this into a dictionary using the "records" orient:

for k, g in df.groupby(level=0):
    print(g.to_dict('records'))  # spell out the orient; the 'r' shorthand is not accepted by newer pandas

[{'age': 20, 'name': 'bob'}, {'age': 30, 'name': 'bob'}]
[{'age': 25, 'name': 'jim'}]

The dictionary comprehension then stores each list of records under its group key.
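Putting the steps above together, here is a minimal self-contained sketch. It assumes the same three-row sample data as in the question; the variable name `result` is only for illustration.

```python
import pandas as pd

# Sample data from the question: 'bob' appears twice.
data = [
    {'name': 'bob', 'age': 20},
    {'name': 'jim', 'age': 25},
    {'name': 'bob', 'age': 30},
]

# Keep 'name' as a column as well as the index, as in the question.
df = pd.DataFrame(data).set_index('name', drop=False)

# Group rows that share an index label, then convert each group
# into a list of per-row dictionaries.
result = {k: g.to_dict(orient='records') for k, g in df.groupby(level=0)}

print(result)
```

Each key of `result` maps to a list, so duplicate index labels no longer overwrite one another.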


GroupBy.apply + to_dict

df.groupby(level=0).apply(lambda x: x.to_dict('records')).to_dict()
# {'bob': [{'age': 20, 'name': 'bob'}, {'age': 30, 'name': 'bob'}],
#  'jim': [{'age': 25, 'name': 'jim'}]}

apply does the same thing as the dictionary comprehension: it iterates over each group. The only difference is that apply returns a Series keyed by group, so one final to_dict call is needed to turn the result into a dictionary.
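To make the comparison concrete, the sketch below runs both approaches on the question's sample data and checks that they agree. The names `per_group`, `via_apply` and `via_comprehension` are illustrative, not part of pandas.

```python
import pandas as pd

data = [
    {'name': 'bob', 'age': 20},
    {'name': 'jim', 'age': 25},
    {'name': 'bob', 'age': 30},
]
df = pd.DataFrame(data).set_index('name', drop=False)

# apply: one list of records per group, collected into a Series
# whose index holds the group keys.
per_group = df.groupby(level=0).apply(lambda g: g.to_dict(orient='records'))

# The extra to_dict call converts that Series into a plain dict.
via_apply = per_group.to_dict()

# Comprehension: builds the dict directly while iterating over the groups.
via_comprehension = {k: g.to_dict(orient='records') for k, g in df.groupby(level=0)}

print(via_apply == via_comprehension)
```

The comprehension skips the intermediate Series, which is why it needs no closing to_dict call.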

cs95
  • You sir, are a wizard and scholar. Thank you so much! I had been struggling with this for hours. Could I get a breakdown of these two approaches? – Nolan Jan 10 '19 at 22:47