df.Last_3mth_Avg.isnull().groupby([df['ShopID'],df['ProductID']]).sum().astype(int).reset_index(name='count')

The code above helps me see the number of null values by ShopID and ProductID. My question: df.Last_3mth_Avg.isnull() returns a Series, so how can groupby([df['ShopID'],df['ProductID']]) be applied to it afterwards?

I took the solution from: Pandas count null values in a groupby function

william007
  • [pd.Series.groupby](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.groupby.html). See the docs. – Scott Boston Oct 23 '19 at 03:21
  • To add some context, the above construction lets you use `SeriesGroupBy.sum`, which is implemented in Cython and is extremely fast. This gets asked a lot, so you can see some simple timing differences here: https://stackoverflow.com/questions/57995951/pandas-count-nas-with-a-groupby-for-all-columns/57996118#57996118 – ALollz Oct 23 '19 at 03:49

2 Answers


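You can filter your df down to the null rows first, then count the rows in each group. Use size() rather than agg('count') here, because count only counts non-null values; note also that groups without any nulls will not appear in the result:

df[df.Last_3mth_Avg.isnull()].groupby(['ShopID','ProductID']).size().reset_index(name='count')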
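
To make the behaviour of the filter-first approach concrete, here is a minimal sketch on made-up data (the values are hypothetical; only the column names come from the question). It shows that counting rows with `size()` gives the number of nulls per group, that `agg('count')` on the filtered frame reports 0 for the null column because it counts non-null values, and that groups with no nulls drop out entirely.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame({
    "ShopID": [1, 1, 1, 2, 2],
    "ProductID": ["a", "a", "b", "a", "a"],
    "Last_3mth_Avg": [np.nan, 2.0, np.nan, 4.0, 5.0],
})

# Keep only the rows where the value is missing.
null_rows = df[df["Last_3mth_Avg"].isnull()]

# size() counts rows per group, i.e. the number of nulls per group.
filtered = (
    null_rows.groupby(["ShopID", "ProductID"])
    .size()
    .reset_index(name="count")
)

# agg("count") counts non-null values, so the filtered column shows 0.
counted = null_rows.groupby(["ShopID", "ProductID"]).agg("count")

print(filtered)
print(counted)
```

If you also need the groups whose null count is zero (shop 2 in this sample), the one-liner from the question keeps them, since it groups every row rather than only the null ones.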
Louis Ng

There are two ways to use groupby:

The common way is to call it on the DataFrame, so you only pass the column names in the by= parameter.

The second way is to call it on a Series and pass other Series of the same length (aligned on the same index) in the by= parameter. This is used less often, and it helps when you want to transform a specific column and group it in the same line. So the line of code above should work.
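
As a rough illustration of the two ways, the sketch below runs both on a small made-up frame (the data is hypothetical; the column names are taken from the question). Both produce the same null counts per ShopID and ProductID.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame({
    "ShopID": [1, 1, 1, 2, 2],
    "ProductID": ["a", "a", "b", "a", "a"],
    "Last_3mth_Avg": [np.nan, 2.0, np.nan, 4.0, 5.0],
})

# Way 1: group the DataFrame by column names, then aggregate one column.
by_names = (
    df.groupby(["ShopID", "ProductID"])["Last_3mth_Avg"]
    .apply(lambda s: s.isnull().sum())
    .reset_index(name="count")
)

# Way 2: build the boolean Series first, then group it by other Series
# that share its index. The key Series supply the group labels.
by_series = (
    df["Last_3mth_Avg"].isnull()
    .groupby([df["ShopID"], df["ProductID"]])
    .sum()
    .astype(int)
    .reset_index(name="count")
)

print(by_names)
print(by_series)
```

The second form works because the boolean Series returned by isnull() keeps the index of df, so pandas can line it up row by row with df['ShopID'] and df['ProductID'].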

Suraj Motaparthy