Keep columns after a groupby in an empty dataframe

Question

The DataFrame is empty after a query. When I call groupby on it, a RuntimeWarning is raised and I get back another empty DataFrame with no columns. How can I keep the columns?

df = pd.DataFrame(columns=["PlatformCategory","Platform","ResClassName","Amount"])
print df

result:

Empty DataFrame
Columns: [PlatformCategory, Platform, ResClassName, Amount]
Index: []

then groupby:

df = df.groupby(["PlatformCategory","Platform","ResClassName"]).sum()
df = df.reset_index(drop=False,inplace=True)
print df

result: sometimes it is None, sometimes it is an empty DataFrame:

Empty DataFrame
Columns: []
Index: []

Why does the empty DataFrame have no columns?

RuntimeWarning:

/data/pyrun/lib/python2.7/site-packages/pandas/core/groupby.py:3672: RuntimeWarning: divide by zero encountered in log
  if alpha + beta * ngroups < count * np.log(count):
/data/pyrun/lib/python2.7/site-packages/pandas/core/groupby.py:3672: RuntimeWarning: invalid value encountered in double_scalars
  if alpha + beta * ngroups < count * np.log(count):

cs95 · Accepted Answer · 2017-09-07T11:53:57.590

6

You need as_index=False, so that the grouping keys stay as regular columns:

df = df.groupby(["PlatformCategory","Platform","ResClassName"], as_index=False).count()
df

Empty DataFrame
Columns: [PlatformCategory, Platform, ResClassName, Amount]
Index: []

No need to reset your index afterwards.
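The None result in the question most likely comes from assigning the return value of reset_index(inplace=True), which modifies the frame in place and returns None. A minimal sketch of both points, reusing the column names from the question (the variable names tmp, returned and counted are only for illustration, and the exact columns of the grouped result may depend on the pandas version):

```python
import pandas as pd

cols = ["PlatformCategory", "Platform", "ResClassName", "Amount"]
keys = cols[:3]
df = pd.DataFrame(columns=cols)

# reset_index(inplace=True) changes the frame itself and returns None,
# so writing `df = df.reset_index(inplace=True)` rebinds df to None.
tmp = df.copy()
returned = tmp.reset_index(drop=False, inplace=True)
print(returned)

# With as_index=False the grouping keys are kept as ordinary columns,
# so no reset_index call is needed afterwards.
counted = df.groupby(keys, as_index=False).count()
print(list(counted.columns))
```

Either drop inplace=True and keep the assignment, or keep inplace=True and drop the assignment.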

edited Sep 07 '17 at 11:53

answered Sep 07 '17 at 07:32


This was exactly what I was also looking for. Thanks a lot! – user2890059 Sep 07 '17 at 07:37
This only works for an empty DataFrame. In the non-empty case, it doesn't work. When I change count() to sum(), it doesn't work either. I want a sum that handles both cases. Do you have any advice? – user2890059 Sep 07 '17 at 08:54
@user2890059 Share some data... in your question? – cs95 Sep 07 '17 at 08:56
@user2890059 If you are trying to find the sum of some particular column, then call sum() on that column. – cs95 Sep 07 '17 at 08:59
After changing to sum, I get an empty DataFrame without columns. – user2890059 Sep 07 '17 at 11:52
@user2890059 Interestingly, I don't think it's possible to do this with sum, because sum condenses all rows in an aggregation attempt. Try it with actual data and you'll understand. – cs95 Sep 07 '17 at 11:54

score 1 · Answer 2 · answered Aug 25 '21 at 13:52

Some code that works the same for .sum() whether or not the dataframe is empty:

def groupby_sum(df, groupby_cols):
    groupby = df.groupby(groupby_cols, as_index=False)
    summed = groupby.sum()
    return (groupby.count() if summed.empty else summed).set_index(groupby_cols)

df = groupby_sum(df, ["PlatformCategory", "Platform", "ResClassName"])
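A quick way to check the helper on both cases. This is only a sketch: the function is repeated so the snippet runs on its own, the sample rows are made up, and the empty frame is built with an explicit float dtype for Amount (an assumption, the question's frame has untyped columns).

```python
import pandas as pd

def groupby_sum(df, groupby_cols):
    # Same helper as above: fall back to count() when the sum is empty.
    groupby = df.groupby(groupby_cols, as_index=False)
    summed = groupby.sum()
    return (groupby.count() if summed.empty else summed).set_index(groupby_cols)

keys = ["PlatformCategory", "Platform", "ResClassName"]

# Non-empty case: hypothetical sample rows.
filled = pd.DataFrame({
    "PlatformCategory": ["web", "web", "app"],
    "Platform": ["A", "A", "B"],
    "ResClassName": ["x", "x", "y"],
    "Amount": [1, 2, 5],
})
filled_result = groupby_sum(filled, keys)
print(filled_result)

# Empty case: same columns, no rows.
empty = pd.DataFrame({
    "PlatformCategory": pd.Series(dtype="object"),
    "Platform": pd.Series(dtype="object"),
    "ResClassName": pd.Series(dtype="object"),
    "Amount": pd.Series(dtype="float64"),
})
empty_result = groupby_sum(empty, keys)
print(empty_result)
```

In both cases the grouping keys end up in the index because of the final set_index call; drop that call if you would rather keep them as columns.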
