Groupby in pandas by including the columns which are in group by condition

Question

using the example from Pandas sum by groupby, but exclude certain columns

Code   Country      Item_Code   Item    Ele_Code    Unit    Y1961    Y1962   Y1963
2      Afghanistan  15          Wheat   5312        Ha      10       20      30
2      Afghanistan  25          Maize   5312        Ha      10       20      30
4      Angola       15          Wheat   7312        Ha      30       40      50
4      Angola       25          Maize   7312        Ha      30       40      50

when we do the

df.groupby(['Country', 'Item_Code'])[["Y1961", "Y1962", "Y1963"]].sum()

the output will be

                       Y1961  Y1962  Y1963
Country     Item_Code
Afghanistan 15            10     20     30
            25            10     20     30
Angola      15            30     40     50
            25            30     40     50

Now , here is my questions

when I do df.columns i will only get Y1961 Y1962 Y1963

But what if I want Country, Item_Code are included as columns like below

df.columns

Country, Item_Code ,Y1961  Y1962  Y1963

score 2 · Accepted Answer · answered Aug 07 '20 at 09:22

you need to specify the arg as_index=False

df.groupby(['Country', 'Item_Code'],as_index=False)[["Y1961", "Y1962", "Y1963"]].sum()


       Country  Item_Code  Y1961  Y1962  Y1963
0  Afghanistan         15     10     20     30
1  Afghanistan         25     10     20     30
2       Angola         15     30     40     50
3       Angola         25     30     40     50


df.columns

Index(['Code', 'Country', 'Item_Code', 'Item', 'Ele_Code', 'Unit', 'Y1961',
       'Y1962', 'Y1963'],
      dtype='object')

you could also do

df.groupby(['Country', 'Item_Code'])[["Y1961", "Y1962", "Y1963"]].sum().reset_index()

It worked ! also I have used Dishin H Goyani code for combining them . — manda1238, Aug 07 '20 at 11:32

score 1 · Answer 2 · answered Aug 07 '20 at 09:22

1

Those are now index names. You can get that using df.index.names.

list(df.index.names) + df.columns.to_list()
['Country', 'Item_Code', 'Y1961', 'Y1962', 'Y1963']

answered Aug 07 '20 at 09:22

Dishin H Goyani

7,195
3
26
37

Groupby in pandas by including the columns which are in group by condition

2 Answers2