Grouping a running summary using pandas Dataframe Python

Question

I have a file full of data

Come up against an issue where i want to summarize results in a query.

Example being as follows:

NAME | FRUIT | PRICE 
JOHN | APPLE |  2
JOHN | APPLE |  2
JOHN | APPLE |  2
JOHN | APPLE |  2
JOHN | GRAPE |  3
DAVE | GRAPE |  3
DAVE | GRAPE |  3
DAVE | GRAPE |  3

This is my table at the moment, what i need though is to have a summary of Johns business, like below:

NAME | FRUIT      | PRICE 
JOHN | APPLE      |  2
JOHN | APPLE      |  2
JOHN | APPLE      |  2
JOHN | APPLE      |  2
JOHN | APPLETOTAL |  8 
JOHN | GRAPE      |  3
JOHN | GRAPETOTAL |  3
DAVE | GRAPE      |  3
DAVE | GRAPE      |  3
DAVE | GRAPE      |  3

Can this be done?

Possible duplicate of [Pandas group-by and sum](https://stackoverflow.com/questions/39922986/pandas-group-by-and-sum) — iacob, Jun 28 '18 at 13:37

score 1 · Answer 1 · answered Jun 28 '18 at 13:37

You can try this :

Create a summary dataframe

summary_df = df.groupby(["NAME","FRUIT"] )["PRICE"].sum()
Then mutate the FRUIT column like this

summary_df.FRUIT = summary_df.FRUIT.apply(lambda x : x + "TOTAL")
Finally add the dataframe back to the base df

df = pd.concat([df, summary_df])

You can sort this dataframe df by column FRUIT to get a sorted dataframe

score 1 · Answer 2 · answered Jun 28 '18 at 13:47

This should sort you out:

totals = df.groupby(['FRUIT', 'NAME']).sum().reset_index()
totals['FRUIT'] = totals['FRUIT'] + 'TOTAL'

result = pd.concat([df, totals]).sort_values(['NAME', 'FRUIT']).reset_index(drop=True)
result
#    NAME       FRUIT  PRICE
#0   DAVE       GRAPE      3
#1   DAVE       GRAPE      3
#2   DAVE       GRAPE      3
#3   DAVE  GRAPETOTAL      9
#4   JOHN       APPLE      2
#5   JOHN       APPLE      2
#6   JOHN       APPLE      2
#7   JOHN       APPLE      2
#8   JOHN  APPLETOTAL      8
#9   JOHN       GRAPE      3
#10  JOHN  GRAPETOTAL      3

Grouping a running summary using pandas Dataframe Python

2 Answers2