2

I am grouping a pandas DataFrame by two columns and summing a third column within each group. It works perfectly, but I am trying to figure out how to iterate over the returned object.

Below is the command I used to group and sum.

groupby_abundance = df.groupby(['mi', 'Organism_name'])[['Rep_Abun']].sum()
print(groupby_abundance)

This is the output of the print above.

                         Rep_Abun
mi     Organism_name          
mi1023 Daylily               9
mi1030 Maize                642
        Kniphofia            17
        Nymphaea            133
mi1083 Liriope               6
mi1097 Kniphofia             6
mi1098 Coconut              95

I want to iterate through groupby_abundance.

Thanks in advance for your help.

parth patel

2 Answers

1

Here is one way. I have used minimal data, but the logic can be extended for your use case.

import pandas as pd

df = pd.DataFrame([['a', 1], ['b', 2], ['a', 3], ['c', 4], ['d', 5],
                   ['a', 6], ['e', 7], ['b', 8], ['d', 9], ['c', 10]],
                  columns=['A', 'B'])

g = df.groupby('A')['B'].sum()

for idx in g.index:
    print(idx, g[idx])

# a 10
# b 10
# c 14
# d 14
# e 7
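
With two grouping columns, as in the question, the summed result has a MultiIndex, so each index entry is a tuple. A minimal sketch of that case, assuming hypothetical sample rows shaped like the question's columns (`mi`, `Organism_name`, `Rep_Abun`), might look like this:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    'mi': ['mi1023', 'mi1030', 'mi1030', 'mi1030', 'mi1083'],
    'Organism_name': ['Daylily', 'Maize', 'Kniphofia', 'Nymphaea', 'Liriope'],
    'Rep_Abun': [9, 642, 17, 133, 6],
})

groupby_abundance = df.groupby(['mi', 'Organism_name'])[['Rep_Abun']].sum()

# iterrows() yields (index, row) pairs; with a two-level MultiIndex the
# index is a (mi, Organism_name) tuple, and row is a Series of summed columns.
rows = []
for (mi, organism), row in groupby_abundance.iterrows():
    rows.append((mi, organism, int(row['Rep_Abun'])))
    print(mi, organism, row['Rep_Abun'])
```

If plain columns are easier to work with than index tuples, calling `groupby_abundance.reset_index()` first turns `mi` and `Organism_name` back into regular columns.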
jpp
0

Edit: Here groupby_abundance is the GroupBy object itself, without the sum applied: groupby_abundance = df.groupby(['mi','Organism_name'])

Each group is a sub-DataFrame, so you can use any method on it that you would use on a full DataFrame.

# Each group is a sub-DataFrame, so the usual DataFrame methods
# and column access work on it.
# `index` is the group key: a (mi, Organism_name) tuple here.

for index, group in groupby_abundance:
    print(group.head())
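
As a rough illustration of the same idea, the group key can be unpacked in the loop header and each sub-DataFrame aggregated by hand. This is only a sketch: the sample rows and the `totals` dictionary are made up for the example, and only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    'mi': ['mi1030', 'mi1030', 'mi1030', 'mi1097'],
    'Organism_name': ['Maize', 'Maize', 'Nymphaea', 'Kniphofia'],
    'Rep_Abun': [600, 42, 133, 6],
})

grouped = df.groupby(['mi', 'Organism_name'])

# Iterating a GroupBy yields (key, sub-DataFrame) pairs; with two
# grouping columns the key is a (mi, Organism_name) tuple.
totals = {}
for (mi, organism), group in grouped:
    totals[(mi, organism)] = int(group['Rep_Abun'].sum())
    print(mi, organism, totals[(mi, organism)])
```

Note that this loop works on the GroupBy object before aggregation. Once `.sum()` has been called, the result is an ordinary DataFrame and has to be iterated by row instead.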
YOLO