I have a Pandas dataframe as below. What I am trying to do is check whether a station has variable yyy and any other variable on the same day (as in the case of station1). If this is true, I need to delete the whole row containing yyy.
Currently I am doing this using iterrows(): I loop to find the days on which this variable appears, change the variable to something like "delete me", build a new dataframe from the result (because changes made to the rows yielded by iterrows() are not written back to the original dataframe), and filter the new dataframe to get rid of the unwanted rows. This works for now because my dataframes are small, but it is not likely to scale.
Question: This seems like a very "non-Pandas" way to do this. Is there some other method of deleting the unwanted rows?
   dateuse              station   variable1
0  2012-08-12 00:00:00  station1  xxx
1  2012-08-12 00:00:00  station1  yyy
2  2012-08-23 00:00:00  station2  aaa
3  2012-08-23 00:00:00  station3  bbb
4  2012-08-25 00:00:00  station4  ccc
5  2012-08-25 00:00:00  station4  ccc
6  2012-08-25 00:00:00  station4  ccc
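
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the sample frame and shows the kind of loop-free filter I am hoping for. It assumes dateuse is already truncated to whole days (as in the sample) and that "any other variable" means a different value of variable1 within the same station and day. The names n_vars, drop and result are only illustrative, and this may not be the best way to do it.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "dateuse": pd.to_datetime([
        "2012-08-12", "2012-08-12", "2012-08-23", "2012-08-23",
        "2012-08-25", "2012-08-25", "2012-08-25",
    ]),
    "station": ["station1", "station1", "station2", "station3",
                "station4", "station4", "station4"],
    "variable1": ["xxx", "yyy", "aaa", "bbb", "ccc", "ccc", "ccc"],
})

# Count distinct variables per (station, day) and broadcast the count
# back onto every row of that group.
n_vars = df.groupby(["station", "dateuse"])["variable1"].transform("nunique")

# Mark a row for removal only if it is yyy AND its station/day group
# contains at least one other variable.
drop = (df["variable1"] == "yyy") & (n_vars > 1)

# Keep everything that is not marked; no Python-level loop is involved.
result = df[~drop]
print(result)
```

A yyy row that is the only variable for its station and day is kept, because its group has just one distinct value. If dateuse carried real timestamps rather than midnight values, grouping on df["dateuse"].dt.normalize() instead would be one way to compare by calendar day.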