I'm working with 20 years of data. The important columns right now are YEAR, MONTH, NUM1, and NUM2. How can I get the monthly percent of NUM1/NUM2?

YEAR | MONTH | NUM1 | NUM2 |
----------------------------
2000 |   6   |  60  | 100  |
2000 |   6   |  55  | 100  |
2000 |   2   |  80  | 160  |

to

YEAR | MONTH | NUM1 | NUM2 |  PCT  |
------------------------------------
2000 |   2   |  80  | 160  |  50   |
2000 |   6   |  60  | 100  |  57.5 |
2000 |   6   |  55  | 100  |  57.5 |

What I want to do is

df2 = df.groupby(['YEAR', 'MONTH'], as_index=False)
df2['PCT'] = df2['NUM1']/df2['NUM2']

However, I get TypeError: unsupported operand type(s) for /: 'DataFrameGroupBy' and 'DataFrameGroupBy'

When I run type(df2) it returns pandas.core.groupby.generic.DataFrameGroupBy. What am I doing wrong, and how can I fix it? It should be pretty simple to sort the data in chronological order and then divide two columns.

Danlo9

2 Answers

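You should use transform, which computes each group's sum and broadcasts it back to every row of that group, so the result lines up with the original DataFrame: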

g = df.groupby(['YEAR', 'MONTH'])
df['PCT']=g.NUM1.transform('sum')/g.NUM2.transform('sum')*100
df
   YEAR  MONTH  NUM1  NUM2   PCT
0  2000      6    60   100  57.5
1  2000      6    55   100  57.5
2  2000      2    80   160  50.0
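
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the three sample rows from the question as the input data, and it adds a sort step (the `result` name is just for illustration) because the question asks for chronological order:

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "YEAR": [2000, 2000, 2000],
    "MONTH": [6, 6, 2],
    "NUM1": [60, 55, 80],
    "NUM2": [100, 100, 160],
})

g = df.groupby(["YEAR", "MONTH"])

# transform('sum') returns one value per original row (the sum of that
# row's group), so the division is aligned with df's index
df["PCT"] = g["NUM1"].transform("sum") / g["NUM2"].transform("sum") * 100

# Put the rows in chronological order, as the question asks
result = df.sort_values(["YEAR", "MONTH"]).reset_index(drop=True)
print(result)
```

Because transform keeps the original row count, both month-6 rows receive the same percentage, which matches the desired output in the question.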
BENY
import pandas as pd

# Sample data; 'sales' and 'price' play the role of NUM1 and NUM2
df = pd.DataFrame({'YEAR': [2000, 2001, 2000, 2000],
                   'MONTH': [2, 4, 6, 2],
                   'sales': [2000, 2001, 2000, 2003],
                   'price': [2020, 2200, 20020, 20023]})

# Sum both columns within each (YEAR, MONTH) group
state_office = df.groupby(['YEAR', 'MONTH']).agg({'sales': 'sum', 'price': 'sum'})

# Percentage of summed sales over summed price, one row per group
state_office['PCT'] = (state_office['sales'] / state_office['price']).mul(100)
state_office.head()
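You didn't define the operation to be performed on the grouped columns: groupby only splits the data, so you need an aggregation such as sum before you can divide. Note that this gives one row per (YEAR, MONTH) group rather than one per original row. Hope this answers your question.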
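
As a rough illustration of the same idea using the question's own columns, the sketch below aggregates first and then merges the per-month percentage back onto the original rows. The names `grouped`, `monthly`, and `merged` are made up for this example, and the data is assumed to be the three sample rows from the question:

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "YEAR": [2000, 2000, 2000],
    "MONTH": [6, 6, 2],
    "NUM1": [60, 55, 80],
    "NUM2": [100, 100, 160],
})

# A DataFrameGroupBy only describes how to split the data; it holds no
# numbers to divide until an aggregation is applied
grouped = df.groupby(["YEAR", "MONTH"], as_index=False)

# Apply the aggregation: one row per (YEAR, MONTH) with summed columns
monthly = grouped[["NUM1", "NUM2"]].sum()
monthly["PCT"] = monthly["NUM1"] / monthly["NUM2"] * 100

# Attach the per-month percentage to every original row
merged = df.merge(monthly[["YEAR", "MONTH", "PCT"]],
                  on=["YEAR", "MONTH"], how="left")
print(merged)
```

This takes an extra merge step compared with transform, but it leaves a per-month summary table (`monthly`) around, which can be handy if the aggregated view is also needed.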

Related links - Pandas percentage of total with groupby

Code run