
I've done a `groupby('column_name',axis=1).agg(sum)` on my dataframe, which has 20000 rows.

Edit: I would like to see the index of only the rows that are duplicated before the groupby, i.e. all the rows that share the same value in the groupby column. How can I do that?

(I'm asking because I get the following warning and I would like to check the result of my groupby:

FutureWarning: Dropping invalid columns in DataFrameGroupBy.add is deprecated. In a future version, a TypeError will be raised. Before calling .add, select only columns which should be valid for the function.)
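As a rough illustration of what the warning is asking for: it suggests the aggregation met columns it could not sum and dropped them. Below is a minimal sketch, assuming a hypothetical frame with one non-numeric column (`label`); the data and every column name except `column_name` are made up for the example.

```python
import pandas as pd

# Hypothetical sample data; "label" is a non-numeric column that should
# not take part in the sum.
df = pd.DataFrame(
    {
        "column_name": ["a", "b", "a"],
        "value": [1, 2, 3],
        "label": ["x", "y", "z"],
    }
)

# Pick the numeric columns explicitly, as the warning recommends.
numeric_cols = df.select_dtypes(include="number").columns.tolist()

# Aggregate only those columns, so nothing has to be dropped silently.
result = df.groupby("column_name")[numeric_cols].sum()
print(result)
```

Comparing `result.columns` with `df.columns` then shows which columns were left out of the aggregation.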

I have searched the forum and Google, but I only get results about grouping by the index.

  • Please add a sample of the data, and say what you are grouping by and what you want after the groupby. – SomeDude May 22 '22 at 23:32
  • The data is 20000 rows... I am only asking for a command to get the index of the resulting merged rows when doing pd.groupby('column_name').agg(sum) or agg(mean) –  May 23 '22 at 14:15
  • It's not clear what "get the index of resulting merged rows" means (`pd.groupby('column_name').agg(sum).index`??). A small example with a subset of the data would probably help clarify that and motivate someone to post an answer. – fsimonjetz May 23 '22 at 14:48
  • pd.groupby('column_name').agg(sum).index gives the index of the whole dataframe. I want the index of the resulting grouped rows only. I don't see what is difficult to understand here –  May 24 '22 at 05:07
  • No, `df.index` gives the index of the whole dataframe, `df.groupby('...').sum().index` gives the index of the aggregated result. If that's not what you mean, please help me understand because I genuinely do not get it. Normally, pandas questions get an answer within minutes – if clearly written (cf. [this post](https://stackoverflow.com/a/20159305/15873043)). – fsimonjetz May 24 '22 at 07:07
  • I've just tried it and I get the index of all the rows, not only the duplicated rows that get grouped. I need the index of only the duplicated rows before they are grouped –  May 24 '22 at 09:41
  • I found my way with set_index() and index.duplicated() ... –  May 24 '22 at 10:00
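
The last comment does not show the code, so the following is only a guess at what that approach might look like. It is a minimal sketch on hypothetical sample data (the values, the `value` column, and the index labels are invented), showing two ways to find the rows whose grouping key occurs more than once before the groupby.

```python
import pandas as pd

# Hypothetical sample data; "column_name" stands in for the real grouping column.
df = pd.DataFrame(
    {"column_name": ["a", "b", "a", "c", "b"], "value": [1, 2, 3, 4, 5]},
    index=[10, 11, 12, 13, 14],
)

# keep=False flags every row whose key is repeated, not just the later copies.
mask = df["column_name"].duplicated(keep=False)
dup_index = df.index[mask]
print(dup_index.tolist())

# Variant along the lines of the set_index() / index.duplicated() comment:
# make the key the index, then keep the rows with a repeated index value.
keyed = df.set_index("column_name")
dup_rows = keyed[keyed.index.duplicated(keep=False)]
print(dup_rows)
```

Here `dup_index` holds the original index labels of the rows that would be merged by the groupby, while `dup_rows` shows those rows keyed by the grouping value.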

0 Answers