Append data to original dataframe, but doing operations on groups

Question

I have a dataframe that has this kind of format:

Level  Nr.  quantity ....
0      ""     ""     ....
2      ""     ""     ....
2      ""     ""     ....
2      ""     ""     ....
2      ""     ""     ....
2      ""     ""     ....
0      ""     ""     ....
2      ""     ""     ....
2      ""     ""     ....
2      ""     ""     ....

Every "0, 2, 2, ..." block is a group; a 0 marks the start of a new group.

I was able to do that with:

grouped_df = df.groupby(df.level.eq(0).cumsum())

This creates a DataFrameGroupBy in which each group ends just before the next 0.
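To illustrate how that grouping key behaves, here is a minimal sketch on made-up data. The sample values are assumptions; only the `level` column name and the `eq(0).cumsum()` expression come from the code above.

```python
import pandas as pd

# Hypothetical sample data; the real frame has more columns.
df = pd.DataFrame({"level": [0, 2, 2, 0, 2, 2, 2]})

# eq(0) is True on each row that starts a block; the running sum
# turns those markers into an increasing group id.
group_id = df["level"].eq(0).cumsum()
print(group_id.tolist())  # [1, 1, 1, 2, 2, 2, 2]

# Number of rows in each block.
sizes = df.groupby(group_id).size()
print(sizes.tolist())  # [3, 4]
```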

Now I want to do operations on single groups (e.g. count the occurrences of a specific string) and store the result in a new column of the original data frame, but I get this warning:

<ipython-input-9-20282836353b>:70: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

I tried this:

for key, item in grouped_df:
    mygroup = grouped_df.get_group(key)
    mygroup['new column'] = (mygroup['nr.'] == "stringtobematched").sum()

Can somebody help me? I'm sure there's a simple way to do this, but as a newbie with Pandas I have no idea.

It would help if you provided a [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) with sample input and desired output (also see [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391)). For help with debugging, it's also important to show the full error traceback message. — AlexK, Feb 21 '23 at 01:17
Having said that, if you are trying to assign the same value to every row in a group (that's as well as I can understand your goal), look into the [groupby transform](https://pandas.pydata.org/docs/reference/api/pandas.core.groupby.DataFrameGroupBy.transform.html) operation. — AlexK, Feb 21 '23 at 01:17
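A minimal sketch of the transform suggestion, assuming the goal is to write the per-group match count onto every row of that group. The sample values are made up; the column names `level` and `nr.` and the string `stringtobematched` are taken from the question's code. Because the result is assigned directly to a column of `df` rather than to a group obtained from `get_group`, no copy of a slice is modified.

```python
import pandas as pd

# Hypothetical sample data using the column names from the question's code.
df = pd.DataFrame({
    "level": [0, 2, 2, 0, 2, 2, 2],
    "nr.": ["a", "stringtobematched", "stringtobematched",
            "b", "stringtobematched", "c", "d"],
})

# Same grouping key as in the question: a new group starts at each 0.
group_id = df["level"].eq(0).cumsum()

# 1 where the string matches, 0 elsewhere.
matches = df["nr."].eq("stringtobematched").astype(int)

# Sum the matches within each group and broadcast the total
# back to every row of that group.
df["new column"] = matches.groupby(group_id).transform("sum")
print(df)
```

With this sample data, the first block gets 2 in every row and the second block gets 1.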

score 0 · Answer 1 · answered Feb 20 '23 at 23:08

If you want to group the data by Level, you can apply one or more aggregation functions to the grouped DataFrame by calling the agg method (short for aggregate).

df.groupby(['Level']).agg(['sum'])

Inside the list that contains sum, you can add other functions such as count, mean, etc.
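As a rough sketch of passing several functions at once, assuming a numeric `quantity` column (the values here are made up):

```python
import pandas as pd

# Hypothetical numeric data; the 'quantity' values are invented for illustration.
df = pd.DataFrame({"Level": [0, 2, 2, 0, 2], "quantity": [1, 2, 3, 4, 5]})

# One result column per function listed in agg.
summary = df.groupby("Level")["quantity"].agg(["sum", "count", "mean"])
print(summary)
```

Note that agg returns one row per distinct Level value, not one value per row of the original frame.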
