
I have a dataframe and I would like to group by bq_market_id and then check whether there are any NaN values in bq_back_price within each group: True for a group if there are, False if not.

   bq_selection_id bq_balance  bq_market_id  bq_back_price
0         45094462     185.04       7278437           1.97
1         45094462     185.04       7278437           1.97
2         45094463     185.04       7278437           3.05
3         45094463     185.04       7278437           3.05
4         45094464     185.04       7278437           5.80
5         45094464     185.04       7278437           5.80
6         45094466     185.04       7278437         200.00
7         45094466     185.04       7278437         200.00
8         45094465     185.04       7278437            NaN
9         45094465     185.04       7278437            NaN

How do I do this? I tried the following, but it did not work:

bb.groupby('bq_market_id')['bq_back_price'].isnull().any()
jezrael
Arthur Zangiev

2 Answers


isnull() is a Series method and is not available on the grouped object, so I think you can use apply:

print(bb.groupby('bq_market_id')['bq_back_price'].apply(lambda x: x.isnull().any()))
bq_market_id
7278437    True
Name: bq_back_price, dtype: bool

Sample (some values in column bq_market_id are changed):

print(bb)
   bq_selection_id  bq_balance  bq_market_id  bq_back_price
0         45094462      185.04             1           1.97
1         45094462      185.04             1           1.97
2         45094463      185.04             1           3.05
3         45094463      185.04       7278437           3.05
4         45094464      185.04       7278437           5.80
5         45094464      185.04       7278437           5.80
6         45094466      185.04       7278437         200.00
7         45094466      185.04       7278437         200.00
8         45094465      185.04       7278437            NaN
9         45094465      185.04       7278437            NaN

print(bb.groupby('bq_market_id')['bq_back_price'].apply(lambda x: x.isnull().any()))
bq_market_id
1          False
7278437     True
Name: bq_back_price, dtype: bool
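
For anyone who wants to run this on Python 3, here is a minimal self-contained sketch of the same apply call. The frame is typed in by hand from the sample printed above, and the variable name has_nan is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt by hand from the printout above
bb = pd.DataFrame({
    'bq_selection_id': [45094462, 45094462, 45094463, 45094463, 45094464,
                        45094464, 45094466, 45094466, 45094465, 45094465],
    'bq_balance': [185.04] * 10,
    'bq_market_id': [1, 1, 1] + [7278437] * 7,
    'bq_back_price': [1.97, 1.97, 3.05, 3.05, 5.80, 5.80,
                      200.00, 200.00, np.nan, np.nan],
})

# One boolean per group: True if any bq_back_price in that group is NaN
has_nan = bb.groupby('bq_market_id')['bq_back_price'].apply(
    lambda x: x.isnull().any()
)
print(has_nan)
```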
jezrael

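I would call isnull() before the groupby and then take the max per group, which is True as soon as one value in the group is null. Calling isnull() on the whole frame would also turn bq_market_id into booleans, so apply it to bq_back_price only and group by the original bq_market_id column:

bb['bq_back_price'].isnull().groupby(bb['bq_market_id']).max()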
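
A small sketch of that idea, assuming a hand-made four-row frame (the values are illustrative, not the asker's full data). It tests for NaN first and then groups the boolean Series by the untouched bq_market_id column. any() is used here in place of max(); on booleans both answer "is at least one value True". The column name group_has_nan is hypothetical.

```python
import numpy as np
import pandas as pd

# Illustrative frame: market 1 has no NaN, market 7278437 has one
bb = pd.DataFrame({
    'bq_market_id': [1, 1, 7278437, 7278437],
    'bq_back_price': [1.97, 3.05, 5.80, np.nan],
})

# Build the boolean mask first, then group it by the original id column
has_nan = bb['bq_back_price'].isnull().groupby(bb['bq_market_id']).any()
print(has_nan)

# Optional: broadcast the per-group flag back onto every row
bb['group_has_nan'] = bb['bq_market_id'].map(has_nan)
print(bb)
```

This avoids the Python-level lambda, which may matter when there are many groups.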