Pandas groupby on one column without losing other columns?

Question

I have a problem with groupby in pandas. I start with this DataFrame:


import pandas as pd

data = {
    'Code_Name': [1, 2, 3, 4, 1, 2, 3, 4],
    'Name': ['Tom', 'Nicko', 'Krish', 'Jack kr', 'Tom', 'Nick', 'Krishx', 'Jacks'],
    'Cat': ['A', 'B', 'C', 'D', 'A', 'B', 'C', 'D'],
    'T': [9, 7, 14, 12, 4, 3, 12, 11],
}

# Create DataFrame
df = pd.DataFrame(data)
df

I have this:
   Code_Name     Name Cat   T
0          1      Tom   A   9
1          2     Nick   B   7
2          3    Krish   C  14
3          4  Jack kr   D  12
4          1      Tom   A   4
5          2     Nick   B   3
6          3   Krishx   C  12
7          4    Jacks   D  11

Now, with groupby:

df.groupby(['Code_Name','Name','Cat'],as_index=False)['T'].sum()

I get this:
   Code_Name     Name Cat   T
0          1      Tom   A  13
1          2     Nick   B  10
2          3    Krish   C  14
3          3   Krishx   C  12
4          4  Jack kr   D  12
5          4    Jacks   D  11

But I need this result:


   Code_Name   Name Cat   T
0          1    Tom   A  13
1          2   Nick   B  10
2          3  Krish   C  26
3          4   Jack   D  23

I don't care about Name; Code_Name is the only thing that matters to me, along with the sum of T. Thanks.

Desired output is of `df.groupby('Cat')['T'].sum()` yet you are grouping by three columns. How do you want to handle different names (i.e. `Krish` vs `Krishx`)? — Chris, Nov 21 '19 at 10:52

score 0 · Answer 1 · edited Nov 21 '19 at 10:57

If you don't care about the other variable then just group by the column of interest:

gb = df.groupby(['Code_Name'],as_index=False)['T'].sum()
print(gb)

   Code_Name   T
0          1  13
1          2  10
2          3  26
3          4  23

Now to get your output, you can take the last value of Name for each group:

gb = df.groupby(['Code_Name'],as_index=False).agg({'Name': 'last', 'Cat': 'first', 'T': 'sum'})
print(gb)

   Code_Name    Name Cat   T
0          1     Tom   A  13
1          2    Nick   B  10
2          3  Krishx   C  26
3          4   Jacks   D  23

answered Nov 21 '19 at 10:50 · Horace · edited Nov 21 '19 at 10:57 by jezrael

That's not working, because I'll lose my Name column and the other columns – Walid Chiko Nov 21 '19 at 10:52
@jezrael I added it at the same time. Note that I used last instead of first, since it does not lead to the required output. – Horace Nov 21 '19 at 10:55
Yes, it was later, but I wrote it without reading yours. Anyway, my bad for the `last` part; it seems that `first` works better. – Horace Nov 21 '19 at 10:57
Yep, you are right, so rollback. If you make some edit, my name will be gone from your answer. – jezrael Nov 21 '19 at 10:59

jezrael · Accepted Answer · 2019-11-21T11:01:49.883

There are 2 ways. To avoid losing columns, add an aggregation function for each one: first, last or ', '.join for string columns, and aggregation functions like sum or mean for numeric columns:

df = df.groupby('Code_Name', as_index=False).agg({'Name': 'first', 'Cat': 'first', 'T': 'sum'})
print(df)
   Code_Name     Name Cat   T
0          1      Tom   A  13
1          2    Nicko   B  10
2          3    Krish   C  26
3          4  Jack kr   D  23
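
The ', '.join option mentioned above is not shown in the outputs. The following is a minimal sketch of it, rebuilding the question's data under the hypothetical name `raw` so it runs on its own; `joined` and `unique_joined` are also names chosen only for illustration.

```python
import pandas as pd

# Same data as in the question; `raw` is an illustrative name.
raw = pd.DataFrame({
    'Code_Name': [1, 2, 3, 4, 1, 2, 3, 4],
    'Name': ['Tom', 'Nicko', 'Krish', 'Jack kr', 'Tom', 'Nick', 'Krishx', 'Jacks'],
    'Cat': ['A', 'B', 'C', 'D', 'A', 'B', 'C', 'D'],
    'T': [9, 7, 14, 12, 4, 3, 12, 11],
})

# Keep every Name seen in a group by joining them into one string.
joined = (raw.groupby('Code_Name', as_index=False)
             .agg({'Name': ', '.join, 'Cat': 'first', 'T': 'sum'}))
print(joined)

# Variant: join only the distinct names, in order of first appearance.
unique_joined = (raw.groupby('Code_Name', as_index=False)
                    .agg({'Name': lambda s: ', '.join(s.unique()),
                          'Cat': 'first',
                          'T': 'sum'}))
print(unique_joined)
```

With the plain join, a group whose names are identical repeats them ("Tom, Tom"); the `unique()` variant collapses that to "Tom" while still keeping both "Krish" and "Krishx".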

Or, if some values are the same within each group, like the Cat values here, add those columns to the groupby. Only the column order changes in the output:

df = df.groupby(['Code_Name', 'Cat'], as_index=False).agg({'Name': 'first', 'T': 'sum'})
print(df)
   Code_Name Cat     Name   T
0          1   A      Tom  13
1          2   B    Nicko  10
2          3   C    Krish  26
3          4   D  Jack kr  23
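
When the frame has many "other" columns, writing the dict by hand gets tedious. One possible sketch, assuming every non-key column other than the summed ones should simply take its first value per group, builds the dict from the column list. The names `raw`, `key`, `sum_cols` and `agg_spec` are hypothetical.

```python
import pandas as pd

# Same data as in the question; `raw` is an illustrative name.
raw = pd.DataFrame({
    'Code_Name': [1, 2, 3, 4, 1, 2, 3, 4],
    'Name': ['Tom', 'Nicko', 'Krish', 'Jack kr', 'Tom', 'Nick', 'Krishx', 'Jacks'],
    'Cat': ['A', 'B', 'C', 'D', 'A', 'B', 'C', 'D'],
    'T': [9, 7, 14, 12, 4, 3, 12, 11],
})

key = 'Code_Name'   # column to group by
sum_cols = ['T']    # columns to sum; everything else keeps its first value

# Assumption: 'first' is an acceptable choice for all remaining columns.
agg_spec = {col: 'first' for col in raw.columns
            if col != key and col not in sum_cols}
agg_spec.update({col: 'sum' for col in sum_cols})

result = raw.groupby(key, as_index=False).agg(agg_spec)
result = result[list(raw.columns)]  # restore the original column order
print(result)
```

Adding a new column to the data then requires no change to the aggregation code, as long as taking the first value per group is acceptable for it.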

Thanks, I just modified your code and it's working for me! – Walid Chiko, Nov 21 '19 at 15:12

Renate van Kempen · Answer 3 · 2019-11-27T12:33:27.717

Perhaps you can try:

    (df.groupby("Code_Name", as_index=False)
       .agg({"Name":"first", "Cat":"first", "T":"sum"}))

See https://datascience.stackexchange.com/questions/53405/pandas-dataframe-groupby-and-then-sum-multi-columns-sperately for the original answer.
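
The same aggregation can also be written with named aggregation (keyword arguments of the form `new_name=(column, function)`, available since pandas 0.25), which lets you rename output columns in the same step. This is only a sketch: `raw`, `named` and the output name `T_total` are made up for illustration.

```python
import pandas as pd

# Same data as in the question; `raw` is an illustrative name.
raw = pd.DataFrame({
    'Code_Name': [1, 2, 3, 4, 1, 2, 3, 4],
    'Name': ['Tom', 'Nicko', 'Krish', 'Jack kr', 'Tom', 'Nick', 'Krishx', 'Jacks'],
    'Cat': ['A', 'B', 'C', 'D', 'A', 'B', 'C', 'D'],
    'T': [9, 7, 14, 12, 4, 3, 12, 11],
})

# Each keyword names an output column: (input column, aggregation function).
named = (raw.groupby('Code_Name')
            .agg(Name=('Name', 'first'),
                 Cat=('Cat', 'first'),
                 T_total=('T', 'sum'))
            .reset_index())  # turn the group key back into a regular column
print(named)
```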

answered Nov 21 '19 at 11:29 · edited Nov 27 '19 at 12:33
