I have a dataframe that looks like this:

import pandas as pd

d = {'country': ['America','America','America','America','Canada','Canada','Canada','Canada'],
     'city': ['New York','New York','San Francisco','San Francisco',u'Montréal',u'Montréal','Toronto','Toronto'],
     'landmark': ['Statue of Liberty', 'Empire State Building','Golden Gate Bridge',
                  'Mission District','Biodome', 'Parc Laurier', 'CN Tower', 'Royal Ontario Museum']}
df = pd.DataFrame(data=d)

I want to convert it to a nested dict like this:

all_options = {
    'America': {
        'New York': ['Statue of Liberty', 'Empire State Building'],
        'San Francisco': ['Golden Gate Bridge', 'Mission District'],
    },
    'Canada': {
        u'Montréal': ['Biodome', 'Parc Laurier'],
        'Toronto': ['CN Tower', 'Royal Ontario Museum'],
    }
}
blabbath
  • This may be useful for you: https://stackoverflow.com/questions/10373660/converting-a-pandas-groupby-output-from-series-to-dataframe/32307259#32307259 – dbz Jun 04 '19 at 07:21
  • Also this one: https://www.geeksforgeeks.org/python-pandas-dataframe-groupby/ – dbz Jun 04 '19 at 07:25

1 Answer

You can do this with a dict comprehension:

all_options = {country: grp.groupby('city')['landmark'].apply(list).to_dict()
               for country, grp in df.groupby('country')}

[out]

{'America': {'New York': ['Statue of Liberty', 'Empire State Building'],
  'San Francisco': ['Golden Gate Bridge', 'Mission District']},
 'Canada': {'Montréal': ['Biodome', 'Parc Laurier'],
  'Toronto': ['CN Tower', 'Royal Ontario Museum']}}
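
To try this end to end, here is a minimal self-contained sketch. It assumes the question's data is bound to a variable named df and compares the result against the dict the question asks for; the name expected is only for illustration.

```python
import pandas as pd

# Data from the question, bound to df (the name the comprehension relies on).
d = {'country': ['America', 'America', 'America', 'America',
                 'Canada', 'Canada', 'Canada', 'Canada'],
     'city': ['New York', 'New York', 'San Francisco', 'San Francisco',
              'Montréal', 'Montréal', 'Toronto', 'Toronto'],
     'landmark': ['Statue of Liberty', 'Empire State Building',
                  'Golden Gate Bridge', 'Mission District',
                  'Biodome', 'Parc Laurier', 'CN Tower', 'Royal Ontario Museum']}
df = pd.DataFrame(data=d)

# Outer groupby yields one sub-frame per country; the inner groupby
# collects each city's landmarks into a list and converts to a dict.
all_options = {
    country: grp.groupby('city')['landmark'].apply(list).to_dict()
    for country, grp in df.groupby('country')
}

# Hypothetical name: the target structure from the question.
expected = {
    'America': {
        'New York': ['Statue of Liberty', 'Empire State Building'],
        'San Francisco': ['Golden Gate Bridge', 'Mission District'],
    },
    'Canada': {
        'Montréal': ['Biodome', 'Parc Laurier'],
        'Toronto': ['CN Tower', 'Royal Ontario Museum'],
    },
}
assert all_options == expected
```

Dict equality ignores key order, so the check passes regardless of the order in which groupby returns the groups.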

Or, if you prefer a more explicit approach, the comprehension is equivalent to this for loop:

all_options = {}

for country, grp in df.groupby('country'):
    all_options[country] = grp.groupby('city')['landmark'].apply(list).to_dict()
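
A possible variation, not part of the approach above: group on both columns at once and fill the nested dict in a single loop. This is only a sketch under the assumption that df holds the question's data; it uses dict.setdefault to create each country's inner dict on first use.

```python
import pandas as pd

# Same data as in the question, bound to df.
df = pd.DataFrame({
    'country': ['America', 'America', 'America', 'America',
                'Canada', 'Canada', 'Canada', 'Canada'],
    'city': ['New York', 'New York', 'San Francisco', 'San Francisco',
             'Montréal', 'Montréal', 'Toronto', 'Toronto'],
    'landmark': ['Statue of Liberty', 'Empire State Building',
                 'Golden Gate Bridge', 'Mission District',
                 'Biodome', 'Parc Laurier', 'CN Tower', 'Royal Ontario Museum'],
})

all_options = {}

# Grouping by two keys yields a (country, city) tuple and the
# matching 'landmark' Series for each group.
for (country, city), landmarks in df.groupby(['country', 'city'])['landmark']:
    # Create the inner dict for a country the first time it appears.
    all_options.setdefault(country, {})[city] = landmarks.tolist()
```

This avoids the nested groupby call, at the cost of building the inner dicts by hand.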

Useful links for the above include DataFrame.groupby, Series.apply and Series.to_dict.

Chris Adams