For the record, I've read the following threads, but none of them seems to meet my need:
- Python pandas - filter rows after groupby
- Pandas get rows after groupby
- filter rows after groupby pandas
Say I have the following table df:
user_id is_manually created_per_week
----------------------------------------
10 True 59
10 False 90
33 True 0
33 False 64
50 True 0
50 False 0
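For reproducibility, the table above can be rebuilt as a DataFrame. This is a minimal sketch; the dtypes (integer user_id and created_per_week, boolean is_manually) are my assumption, since the table only shows the printed values.

```python
import pandas as pd

# Rebuild the example table; dtypes are assumed from the printed values.
df = pd.DataFrame(
    {
        "user_id": [10, 10, 33, 33, 50, 50],
        "is_manually": [True, False, True, False, True, False],
        "created_per_week": [59, 90, 0, 64, 0, 0],
    }
)
print(df)
```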
I want to exclude the users who have created nothing, i.e. users whose created_per_week is 0 in both the is_manually True and False rows (user 50 in this case). The expected result is:
user_id is_manually created_per_week
----------------------------------------
10 True 59
10 False 90
33 True 0
33 False 64
I learned that the object returned by df.groupby doesn't have a query method and that I should use apply instead.
The closest I've got is df.groupby("user_id").apply(lambda x: x[x["created_per_week"] > 0]), but it also drops the row for user 33 with is_manually True, which is undesirable. I've also tried df.groupby("user_id").apply(lambda x: x[any(x["created_per_week"] > 0)]), but it raises a KeyError.
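My guess at why the second attempt fails, as a small sketch on a single group: the built-in any() collapses the per-row comparison into one Python bool, and indexing a DataFrame with a scalar appears to be treated as a column lookup rather than a row mask. The variable names group, flag, and raised are only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "user_id": [10, 10, 33, 33, 50, 50],
        "is_manually": [True, False, True, False, True, False],
        "created_per_week": [59, 90, 0, 64, 0, 0],
    }
)

# Take one group by hand to mimic what apply passes to the lambda.
group = df[df["user_id"] == 33]

# any() reduces the boolean Series to a single Python bool.
flag = any(group["created_per_week"] > 0)
print(type(flag), flag)

# A scalar key is looked up among the column labels, and there is
# no column named True, hence the KeyError.
try:
    group[flag]
    raised = False
except KeyError:
    raised = True
print("KeyError raised:", raised)
```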
In other words, I am looking for the pandas equivalent of df %>% group_by(user_id) %>% filter(any(created_per_week > 0)) in R. Thanks.
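For reference, here is a sketch of what I suspect the pandas counterpart might look like, using GroupBy.filter, which keeps or drops whole groups based on one boolean per group. I am not sure this is the idiomatic way, so treat it as a candidate rather than a confirmed answer; the name kept is only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "user_id": [10, 10, 33, 33, 50, 50],
        "is_manually": [True, False, True, False, True, False],
        "created_per_week": [59, 90, 0, 64, 0, 0],
    }
)

# Keep every row of a user if at least one of that user's rows
# has created_per_week > 0; drop the whole group otherwise.
kept = df.groupby("user_id").filter(
    lambda g: (g["created_per_week"] > 0).any()
)
print(kept)
```

On the example data this should retain all rows for users 10 and 33, including the user 33 row with created_per_week 0, and drop both rows for user 50.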