Groupby agg says: ValueError: Must produce aggregated value

Question

I am trying to group a huge dataframe (3.5 billion observations) by two columns and multiply the resulting columns pairwise, as follows:

FirstNeighborVars_s2=dat2.groupby(by=['NiuCust2', 'year']).agg(
    s2_nv_importing=('NVCost2_sum', lambda x: (x * dat2.loc[x.index, 'importing'])),
    s2_prop_importing=('PROPCost2_sum', lambda x: (x * dat2.loc[x.index, 'importing']))
).reset_index()

This works on a smaller version of the data (when dat2 is defined as dat2.head(10000)), but on the full dataframe used in the code above it fails with the following error:

ValueError: Must produce aggregated value

Why does this error arise? Is there another way to perform the following series of operations (this code does not actually run, because pandas does not support this kind of arithmetic and assignment on GroupBy objects):

FirstNeighborVars_s2=dat2.groupby(by=['NiuCust2', 'year'])

FirstNeighborVars_s2["s2_nv_importing"]=FirstNeighborVars_s2["NVCost2_sum"]*FirstNeighborVars_s2["importing"]

FirstNeighborVars_s2["s2_prop_importing"]=FirstNeighborVars_s2["PROPCost2_sum"]*FirstNeighborVars_s2["importing"]

Thanks a lot
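
A likely cause, sketched on made-up data: agg expects each function to reduce a group to a single value, but the lambdas in the question return one product per row, so any group with more than one row yields a Series rather than a scalar. The small frame and its values below are hypothetical; only the column names come from the question.

```python
import pandas as pd

# Hypothetical toy data; only the column names are taken from the question.
dat2 = pd.DataFrame({
    "NiuCust2": ["a", "a", "b"],
    "year": [2001, 2001, 2001],
    "NVCost2_sum": [1.0, 2.0, 3.0],
    "importing": [1, 0, 1],
})

# The same function that is passed to agg in the question.
func = lambda x: x * dat2.loc[x.index, "importing"]

grouped = dat2.groupby(["NiuCust2", "year"])["NVCost2_sum"]

# Apply the function by hand to each group and record how many values come back.
result_lengths = {key: len(func(group)) for key, group in grouped}
print(result_lengths)
```

The group with two rows returns two values, which is not an aggregate. This may also be why dat2.head(10000) appeared to work: whether the error surfaces can depend on the pandas version and on the sizes of the groups encountered, which has not been checked against the original data.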

Does this answer your question? [Must produce aggregated value. I swear that I am](https://stackoverflow.com/questions/39840546/must-produce-aggregated-value-i-swear-that-i-am) — OCa, Aug 13 '23 at 22:29
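
If the goal is the per-group sum of each product (an assumption; the question does not say which reduction is wanted), one option is to compute the products as ordinary columns first and then aggregate with a built-in reduction, which avoids Python lambdas entirely. A minimal sketch on hypothetical toy data:

```python
import pandas as pd

# Hypothetical toy data; only the column names are taken from the question.
dat2 = pd.DataFrame({
    "NiuCust2": ["a", "a", "b"],
    "year": [2001, 2001, 2002],
    "NVCost2_sum": [1.0, 2.0, 3.0],
    "PROPCost2_sum": [0.5, 0.25, 1.0],
    "importing": [1, 0, 1],
})

# Row-wise products need no groupby: plain column arithmetic is vectorized.
products = dat2.assign(
    s2_nv_importing=dat2["NVCost2_sum"] * dat2["importing"],
    s2_prop_importing=dat2["PROPCost2_sum"] * dat2["importing"],
)

# Assumption: "sum" is the intended reduction; swap in "mean", "max", etc. if not.
FirstNeighborVars_s2 = (
    products.groupby(["NiuCust2", "year"], as_index=False)[
        ["s2_nv_importing", "s2_prop_importing"]
    ].sum()
)
print(FirstNeighborVars_s2)
```

If no reduction is wanted at all, the products frame already holds the row-wise result and the groupby step can be dropped.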

