Convert DataFrameGroupBy object to DataFrame pandas

Question

I had a DataFrame, did a groupby on FIPS, and summed the groups. That worked fine.

kl = ks.groupby('FIPS')

kl.aggregate(np.sum)

I just want a normal DataFrame back, but I have a pandas.core.groupby.DataFrameGroupBy object.

The question title indicates that the question is about how to generally convert a groupby object back to a data frame, yet the question and the accepted answer are only about one special case (sum aggregation). Both the question and the accepted answer would be a lot more helpful if they were about how to generally convert a groupby object to a data frame, without performing any numeric processing on it. — Alex, Nov 07 '19 at 10:03
To get a single group as a DataFrame, use something like ks.groupby('FIPS').get_group("whatever groupby value you have"). — mahmoh, May 27 '20 at 14:22
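
A minimal sketch of the `get_group` suggestion in the comment above. The frame `ks` and its `FIPS` and `population` values are invented for illustration and are not from the question. Pulling one group out of the groupby gives back an ordinary DataFrame holding only that group's rows:

```python
import pandas as pd

# Hypothetical stand-in for the question's `ks` frame; the values are made up.
ks = pd.DataFrame({
    "FIPS": ["01001", "01003", "01001", "01003"],
    "population": [10, 20, 30, 40],
})

kl = ks.groupby("FIPS")

# get_group takes one of the group keys and returns that group's rows.
one_group = kl.get_group("01001")

print(type(one_group).__name__)
print(one_group)
```

Note that this returns one group at a time, so it does not by itself rebuild the whole frame.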

score 28 · Answer 1 · answered Mar 10 '18 at 09:55

 df_g.apply(lambda x: x)

will return the original dataframe.
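
A rough sketch of this idea, under two assumptions that are not part of the answer: `group_keys=False` is passed to `groupby` so that the group key is not added as an extra index level, and the value columns are selected explicitly so the result does not depend on how a given pandas version treats the grouping column inside `apply`. The frame and its values are made up:

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar"],
    "C": [1.0, 2.0, 3.0, 4.0],
    "D": [10.0, 20.0, 30.0, 40.0],
})

# group_keys=False asks pandas not to prepend the group key to the index.
df_g = df.groupby("A", group_keys=False)

# Apply the identity function to each group of the selected value columns.
restored = df_g[["C", "D"]].apply(lambda x: x)

# Compare against the same columns of the original frame.
print(restored.equals(df[["C", "D"]]))
```

With `group_keys=True` on recent pandas versions, the result would instead carry the group key as an outer index level.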

answered Mar 10 '18 at 09:55

Tengfei Li

But why is this needed? – cs95 Jan 22 '19 at 03:27
This still returns DFGroupby – hungryMind May 10 '20 at 12:41
@cs95 This is equivalent to `pd.DataFrame(grouped.groups)`. The `GroupBy.apply` function applies func to every group and combines the results into a `DataFrame`. – C.K. Aug 20 '20 at 07:14
@C.K. I understand that, thank you. However, my point was more about why we need this method to return the original DataFrame if df_g itself is the original DataFrame? If it's a question of what apply does and how to apply a function to every group, that's a discussion for another post. 2c – cs95 Aug 20 '20 at 08:07
@cs95 Yep, you're right. I upvoted your comment the first time I saw this answer, because I thought there must be an easier way like `grouped.to_df()`. However, after I checked the API of the `GroupBy` object, I found there wasn't such a function, so I came back to tell everyone this is the easiest way to do that. lol. – C.K. Aug 20 '20 at 10:13
In answer to @cs95, I can only speak to why I sought out this question: I needed this to see how a grouping changed the indices, or to visualize what had been condensed after grouping. This often comes up for me with a heavily nested MultiIndex or when I want to group an already grouped df. I suppose this is a shortcut for slicing, but as a new user of MultiIndex slicing, I have needed it to find my way. – double0darbo Jul 27 '21 at 15:37
I see that today (pd '1.5.3'), one should first pass either 'group_keys=True' or 'group_keys=False' as an argument to groupby before trying the above. It is still the right answer IMHO. – Oren Apr 14 '23 at 17:48

score 24 · Accepted Answer · answered Nov 27 '12 at 11:16

The result of kl.aggregate(np.sum) is a normal DataFrame; you just have to assign it to a variable to use it further. With some random data:

>>> import numpy as np
>>> from numpy.random import randn
>>> from pandas import DataFrame
>>> df = DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
...                         'foo', 'bar', 'foo', 'foo'],
...                  'B' : ['one', 'one', 'two', 'three',
...                         'two', 'two', 'one', 'three'],
...                  'C' : randn(8), 'D' : randn(8)})
>>> grouped = df.groupby('A')
>>> grouped
<pandas.core.groupby.DataFrameGroupBy object at 0x04E2F630>
>>> test = grouped.aggregate(np.sum)
>>> test
            C         D
A                      
bar -1.852376  2.204224
foo -3.398196 -0.045082
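
A minimal sketch of the same point with fixed numbers instead of random data. The column names mirror the example above, while the values are made up. Selecting the numeric columns before summing is an assumption added here, so that the string column `B` stays out of the sum regardless of pandas version:

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar"],
    "B": ["one", "one", "two", "three"],
    "C": [1.0, 2.0, 3.0, 4.0],
    "D": [10.0, 20.0, 30.0, 40.0],
})

grouped = df.groupby("A")

# Aggregating produces a new DataFrame; assign it to keep working with it.
test = grouped[["C", "D"]].sum()

print(isinstance(test, pd.DataFrame))
print(test)
```

The group key `A` ends up as the index of `test` rather than as a regular column.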

answered Nov 27 '12 at 11:16

joris

Actually, many DataFrameGroupBy methods (such as apply, transform, aggregate, head, first, last) return a DataFrame object. I used the method `filter` in [one](https://kenandeen.wordpress.com/2015/06/20/unisex-names-data-analysis-use-case/) of my blog posts. – Ken D Jun 20 '15 at 06:29
It's not a completely normal DataFrame. For example, if you try to call the .info() method on a GroupBy object, you get `AttributeError: Cannot access callable attribute 'info' of 'DataFrameGroupBy' objects, try using the 'apply' method.` – Adrian Keister Sep 10 '18 at 17:37
Call .reset_index() to convert the grouped indices back into columns. – hungryMind May 10 '20 at 12:48
+1 @hungryMind - *that* is the answer. Re joris's answer: it may be a "dataframe" but it's not normal. You can see that A sits on a different header row from C and D, which causes plots etc. to fail when it is used as a normal dataframe. It needs collapsing with .reset_index() to make it proper! – TickboxPhil Jun 26 '21 at 14:20
kl.count() returns a DataFrame – vkt Mar 09 '22 at 15:51
There is what appears to be an undocumented property, `.obj`, which has the original object with grouped transformations applied. See https://stackoverflow.com/a/66879388/459863 A feature request with Pandas was also filed which remains open as of this writing: https://github.com/pandas-dev/pandas/issues/43902 – Wolfram Arnold Apr 27 '23 at 17:29
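
The `.reset_index()` suggestion from the comments above can be sketched as follows. The frame and values are made up, and `as_index=False` is shown as an alternative that the comments do not mention:

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar"],
    "C": [1.0, 2.0, 3.0, 4.0],
})

# After aggregation the group key "A" is the index, not a column.
summed = df.groupby("A").sum()

# Option 1: move the index back into a regular column.
flat = summed.reset_index()

# Option 2: ask groupby not to use the key as the index in the first place.
flat_direct = df.groupby("A", as_index=False).sum()

print(flat)
print(flat_direct)
```

Either way the result has `A` and `C` as ordinary columns and a plain integer index, which is usually what plotting code expects.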

score 1 · Answer 3 · answered Aug 28 '20 at 08:02

Use pd.concat, like this:

   pd.concat(map(lambda x: x[1], groups))

Or, to also restore the original index order:

   pd.concat(map(lambda x: x[1], groups)).sort_index()
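
A self-contained sketch of the two snippets above, assuming `groups` is the object returned by `groupby` (the answer does not say so explicitly). The frame and values are made up, and a list comprehension stands in for the `map`/`lambda` pair:

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar"],
    "C": [1, 2, 3, 4],
})

groups = df.groupby("A")

# Iterating a groupby yields (key, sub-frame) pairs; keep only the sub-frames.
stacked = pd.concat([group for _, group in groups])

# Rows come back ordered by group; sorting the index restores the original order.
restored = stacked.sort_index()

print(stacked)
print(restored.equals(df))
```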

answered Aug 28 '20 at 08:02

Rogers

score 0 · Answer 4 · edited Dec 01 '20 at 07:47

You can call .head(<number of rows>) on the groupby and assign the result to a variable.

Ex: df2 = grouped.head(100)

Now you have a pandas DataFrame, df2, containing up to the first 100 rows of each group.
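
A small sketch of what `.head(n)` on a groupby returns, using made-up data. It keeps at most the first `n` rows of each group, so the full frame only comes back when `n` is at least as large as the biggest group:

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar", "foo"],
    "C": [1, 2, 3, 4, 5],
})

grouped = df.groupby("A")

# n larger than every group: all rows come back, in their original order.
df2 = grouped.head(100)

# n smaller than a group: rows beyond the first n of that group are dropped.
first_two = grouped.head(2)

print(df2)
print(first_two)
```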

edited Dec 01 '20 at 07:47

Dr. Mantis Tobbogan

answered Nov 30 '20 at 21:15

andy
