Losing keys in pandas dataframe after using groupby

Question

I'm pretty new to Python and I just encountered a problem.

mini_agg is my original pandas DataFrame and I'm trying to group it by several columns.

trial = mini_agg.groupby(['date','product','product_type_1','product_type_2','product_type_3','product_type_4']).sum()

print(mini_agg.shape)
print(trial.shape)

output:

(2965909, 10)
(499281, 4)

Furthermore, I cannot access the keys I grouped by. In R, I get my columns back when using aggregate.

Can you please help me? Thank you in advance

Please include the mini_agg values with your code. – Kian, Oct 20 '16 at 13:08

score 1 · Accepted Answer · edited May 23 '17 at 11:53


How to GroupBy a Dataframe in Pandas and keep Columns

Just found the answer I didn't find with my previous queries:

trial = mini_agg.groupby(['date','product','product_type_1','product_type_2','product_type_3','product_type_4']).sum().reset_index()

It is sufficient to add .reset_index(), which moves the group keys from the index back into regular columns.
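As a rough illustration of why the keys seem to disappear, here is a minimal sketch. The small mini_agg below is a made-up stand-in (the real data was never posted), and the column names "sales" and the two-key grouping are assumptions chosen only to keep the example short. It shows that after groupby the keys live in the index, and that either .reset_index() or the as_index=False argument brings them back as columns.

```python
import pandas as pd

# Hypothetical stand-in for mini_agg; the original data was not posted.
mini_agg = pd.DataFrame({
    "date": ["2016-10-01", "2016-10-01", "2016-10-02"],
    "product": ["a", "a", "b"],
    "sales": [1.0, 2.0, 3.0],
})

keys = ["date", "product"]

# By default the group keys become the (Multi)Index of the result,
# so they no longer count as columns in .shape.
grouped = mini_agg.groupby(keys).sum()
print(grouped.shape)
print(list(grouped.index.names))

# reset_index() turns the index levels back into ordinary columns.
trial = mini_agg.groupby(keys).sum().reset_index()
print(list(trial.columns))

# Alternative: ask groupby not to use the keys as the index at all.
trial_alt = mini_agg.groupby(keys, as_index=False).sum()
print(list(trial_alt.columns))
```

This is also consistent with the shapes in the question: 10 original columns minus the 6 grouping keys leaves the 4 columns reported for trial.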


answered Oct 20 '16 at 10:23

Tommaso Guerrini


Kian · Answer 2 · 2016-10-20T13:30:36.330

I expected the mini_agg values to be provided; without them, I suppose it is a combination of two one-dimensional labeled data structures (Series). As you mentioned, mini_agg is a pandas DataFrame, and the DataFrame constructor, like Series, can accept another DataFrame as input.

Therefore, if mini_agg looks like this:

import pandas as pd
FRAME = {'one': pd.Series([1., 2., 3.], index=['product_type_1', 'product_type_2', 'product_type_3']),
         'two': pd.Series([1., 2., 3., 4.], index=['product_type_1', 'product_type_2', 'product_type_3', 'product_type_4'])}
mini_agg = pd.DataFrame(FRAME)

So,

trial = pd.DataFrame(mini_agg, index=['date','product','product_type_1','product_type_2','product_type_3','product_type_4'], columns=['A', 'B', 'C', 'D', 'E', 'F'])
