
I have a DataFrame 'Top15' whose index is the country name:

Top15.index
Index(['China', 'United States', 'Japan', 'United Kingdom',
   'Russian Federation', 'Canada', 'Germany', 'India', 'France',
   'South Korea', 'Italy', 'Spain', 'Iran', 'Australia', 'Brazil'],
  dtype='object', name='Country')

I also have a dictionary that maps each country to its continent:

ContinentDict  = {'China':'Asia', 
              'United States':'North America', 
              'Japan':'Asia', 
              'United Kingdom':'Europe', 
              'Russian Federation':'Europe', 
              'Canada':'North America', 
              'Germany':'Europe', 
              'India':'Asia',
              'France':'Europe', 
              'South Korea':'Asia', 
              'Italy':'Europe', 
              'Spain':'Europe', 
              'Iran':'Asia',
              'Australia':'Australia', 
              'Brazil':'South America'}

I want to group the countries by continent, so I created a 'Continent' column:

Top15['Continent']=Top15.index.map(ContinentDict)

After that, I tried to group by continent:

Top15.groupby('Continent')

I received the following output:

<pandas.core.groupby.generic.DataFrameGroupBy object at 0x0000023CBF691AC8>

When I checked the DataFrame, it was not grouped. Is that because the country is the index rather than a column? What should I do?

Ch3steR
nona
  • Unfortunately, you can't do that. To get each group, iterate over the result: `for k, g in Top15.groupby('Continent'): print(k); print(g)` – Ch3steR Jun 05 '20 at 19:23
  • The `GroupBy` object is really just a collection; it contains a mapping from the original DataFrame to your groups. You need to apply an operation to it (an aggregation, for example) to get back a Series or DataFrame. – ALollz Jun 05 '20 at 19:23
  • You might find this useful: https://stackoverflow.com/questions/61810108/why-does-groupby-operations-behave-differently, also https://stackoverflow.com/questions/62092600/how-does-apply-work-on-a-pandas-dataframe/62092807#62092807 – ALollz Jun 05 '20 at 19:26
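
The points made in the comments can be sketched as follows. This is a minimal example, assuming a small stand-in for `Top15` with a hypothetical numeric column `Value` (the real columns are not shown in the question). `groupby` returns a `DataFrameGroupBy` object, and the grouped results only become visible once you iterate over it or apply an aggregation.

```python
import pandas as pd

# Stand-in for Top15: the 'Value' column is hypothetical, for illustration only.
top15 = pd.DataFrame(
    {"Value": [10, 20, 30, 40]},
    index=pd.Index(["China", "Japan", "Germany", "France"], name="Country"),
)
continent_dict = {"China": "Asia", "Japan": "Asia",
                  "Germany": "Europe", "France": "Europe"}
top15["Continent"] = top15.index.map(continent_dict)

# This is only a GroupBy object; no grouped result is displayed yet.
grouped = top15.groupby("Continent")

# Option 1: iterate over (group name, sub-DataFrame) pairs.
for continent, frame in grouped:
    print(continent)
    print(frame)

# Option 2: apply an aggregation to get a regular Series back.
totals = grouped["Value"].sum()
print(totals)
```

The original DataFrame is never rearranged by `groupby`, which is why `Top15` looks unchanged after the call.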

1 Answer

import pandas as pd



ContinentDict  = {'China':'Asia', 
              'United States':'North America', 
              'Japan':'Asia', 
              'United Kingdom':'Europe', 
              'Russian Federation':'Europe', 
              'Canada':'North America', 
              'Germany':'Europe', 
              'India':'Asia',
              'France':'Europe', 
              'South Korea':'Asia', 
              'Italy':'Europe', 
              'Spain':'Europe', 
              'Iran':'Asia',
              'Australia':'Australia', 
              'Brazil':'South America'}


# Build a two-column DataFrame (Country, Continent) from the dictionary.
df = pd.DataFrame(list(ContinentDict.items()), columns=['Country', 'Continent'])
print(df)

"""
               Country      Continent
0                China           Asia
1        United States  North America
2                Japan           Asia
3       United Kingdom         Europe
4   Russian Federation         Europe
5               Canada  North America
6              Germany         Europe
7                India           Asia
8               France         Europe
9          South Korea           Asia
10               Italy         Europe
11               Spain         Europe
12                Iran           Asia
13           Australia      Australia
14              Brazil  South America

"""



# Count the countries in each continent, largest group first.
df12 = (df.groupby('Continent').size().reset_index(name='Count')
        .sort_values('Count', ascending=False))
print(df12)

"""
       Continent  Count
2         Europe      6
0           Asia      5
3  North America      2
1      Australia      1
4  South America      1

"""


# Collect the countries of each continent into a list.
df1 = df.groupby('Continent')['Country'].apply(list)

print(df1)

"""
Continent
Asia                      [China, Japan, India, South Korea, Iran]
Australia                                              [Australia]
Europe           [United Kingdom, Russian Federation, Germany, ...
North America                              [United States, Canada]
South America                                             [Brazil]
Name: Country, dtype: object

"""
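
If the goal is to group the original `Top15`, where `Country` is the index rather than a column, one option is to pass the dictionary straight to `groupby`, which then maps index labels to group names. This is a minimal sketch, assuming a hypothetical `Value` column since the real columns are not shown in the question:

```python
import pandas as pd

continent_dict = {"China": "Asia", "Japan": "Asia",
                  "Germany": "Europe", "Brazil": "South America"}

# Stand-in frame indexed by country; 'Value' is a made-up column.
top15 = pd.DataFrame(
    {"Value": [1.0, 2.0, 3.0, 4.0]},
    index=pd.Index(list(continent_dict), name="Country"),
)

# A dict passed to groupby is applied to the index labels.
by_continent = top15.groupby(continent_dict)

counts = by_continent.size()          # number of countries per continent
means = by_continent["Value"].mean()  # mean of the hypothetical column
print(counts)
print(means)
```

With this approach no extra 'Continent' column is needed, although adding one as in the question works just as well.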
Soudipta Dutta