In a given DataFrame I have these two columns:
neighbourhood_group
price
The price column contains the prices for every neighbourhood_group:
neighbourhood_group price
0 Brooklyn 149
1 Manhattan 225
2 Manhattan 150
3 Brooklyn 89
4 Manhattan 80
5 Manhattan 200
6 Brooklyn 60
7 Manhattan 79
8 Manhattan 79
9 Manhattan 150
I am trying to detect outliers within each neighbourhood_group.
The only idea I have come up with so far is to group the prices by neighbourhood_group, detect outliers within each group, and create a mask for the rows that need to be dropped.
data.groupby('neighbourhood_group')['price']
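To make the idea concrete, here is a minimal sketch of the group-then-mask approach. The 1.5 * IQR rule, the helper name `iqr_outlier_mask`, and the factor `k` are assumptions for illustration only; nothing above fixes a particular outlier criterion, and any per-group rule could be swapped in.

```python
import pandas as pd

# Sample rows copied from the table above
data = pd.DataFrame({
    "neighbourhood_group": [
        "Brooklyn", "Manhattan", "Manhattan", "Brooklyn", "Manhattan",
        "Manhattan", "Brooklyn", "Manhattan", "Manhattan", "Manhattan",
    ],
    "price": [149, 225, 150, 89, 80, 200, 60, 79, 79, 150],
})


def iqr_outlier_mask(df, group_col, value_col, k=1.5):
    """Return a boolean Series that is True for rows outside the per-group IQR fences.

    Hypothetical helper: the 1.5 * IQR rule is an assumed criterion.
    """
    grouped = df.groupby(group_col)[value_col]
    # transform() broadcasts each group's quartile back onto that group's rows,
    # so q1 and q3 are aligned with df's index
    q1 = grouped.transform(lambda s: s.quantile(0.25))
    q3 = grouped.transform(lambda s: s.quantile(0.75))
    iqr = q3 - q1
    below = df[value_col] < q1 - k * iqr
    above = df[value_col] > q3 + k * iqr
    return below | above


mask = iqr_outlier_mask(data, "neighbourhood_group", "price")
cleaned = data[~mask]  # keep only the rows that are not flagged
```

Because `transform` returns a result with the same index as `data`, the mask can be applied directly without merging the group statistics back in.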
I suspect there might be an easier solution for that.