Groupby following a Series

Question

df.Last_3mth_Avg.isnull().groupby([df['ShopID'],df['ProductID']]).sum().astype(int).reset_index(name='count')

The code above helps me see the number of null values by ShopID and ProductID. My question is: df.Last_3mth_Avg.isnull() returns a Series, so how can groupby([df['ShopID'],df['ProductID']]) be used on it afterwards?

I took the solution from: Pandas count null values in a groupby function

See the docs for [pd.Series.groupby](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.groupby.html). — Scott Boston, Oct 23 '19 at 03:21
To add some context, the above construction allows you to make use of `SeriesGroupBy.sum`, which is implemented in Cython and is extremely fast. This gets asked a lot, so you can see some simple timing differences here: https://stackoverflow.com/questions/57995951/pandas-count-nas-with-a-groupby-for-all-columns/57996118#57996118 — ALollz, Oct 23 '19 at 03:49

score 0 · Answer 1 · answered Oct 23 '19 at 03:02

You should filter your df first, then count the rows in each group:

df[df.Last_3mth_Avg.isnull()].groupby(['ShopID','ProductID']).size().reset_index(name='count')
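A minimal sketch of the filter-first approach on a small made-up DataFrame (the data is hypothetical; only the column names come from the question). Two things are worth checking when using it: `count` tallies non-null values, so on the filtered rows it reports 0 for `Last_3mth_Avg`, whereas `size` counts rows; and groups without any null value drop out of the result entirely.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "ShopID": [1, 1, 1, 2, 2],
    "ProductID": ["a", "a", "b", "a", "a"],
    "Last_3mth_Avg": [np.nan, 2.0, np.nan, 4.0, 5.0],
})

# Keep only the rows where Last_3mth_Avg is null.
null_rows = df[df["Last_3mth_Avg"].isnull()]

# size() counts rows per group, so it gives the number of nulls.
filtered = (
    null_rows.groupby(["ShopID", "ProductID"])
             .size()
             .reset_index(name="count")
)

# count() counts non-null values per column, so the filtered
# Last_3mth_Avg column (all NaN by construction) always gives 0.
counted = null_rows.groupby(["ShopID", "ProductID"]).agg("count")

print(filtered)
print(counted)
```

In this sample, shop 2 / product "a" has no nulls and therefore does not appear in `filtered`, while the `isnull().groupby(...).sum()` form from the question would list it with a count of 0.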

Louis Ng

score 0 · Answer 2 · answered Oct 23 '19 at 03:07

There are two ways to use groupby:

The common way is to call it on the DataFrame, so you just pass the column names in the by= parameter.

The second way is to call it on a Series and pass equal-length Series in the by= parameter. This is used less often, and it helps when you want to do conversions on a specific column and use groupby in the same line. So the code line above should work.
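A minimal sketch of both styles, using a small made-up DataFrame (the data is hypothetical; only the column names come from the question). The Series version works because the key Series share the same index as the Series being grouped, so the rows line up:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "ShopID": [1, 1, 1, 2, 2],
    "ProductID": ["a", "a", "b", "a", "a"],
    "Last_3mth_Avg": [np.nan, 2.0, np.nan, 4.0, 5.0],
})

# Way 1: group the DataFrame by column names, then count nulls per group.
by_frame = (
    df.groupby(["ShopID", "ProductID"])["Last_3mth_Avg"]
      .apply(lambda s: s.isnull().sum())
      .reset_index(name="count")
)

# Way 2: build a boolean Series first, then group it by other Series
# of the same length taken from the same DataFrame.
by_series = (
    df["Last_3mth_Avg"].isnull()
      .groupby([df["ShopID"], df["ProductID"]])
      .sum()
      .astype(int)
      .reset_index(name="count")
)

print(by_frame)
print(by_series)
```

Both produce one row per (ShopID, ProductID) pair with the number of nulls in `Last_3mth_Avg`. The second form avoids a Python-level lambda, which is likely why it is the faster of the two on large data.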
