
I have a DataFrame with 950 rows. Let's pretend the columns are timestamp, quantity, event, and file; this is a good approximation of the df. I want to:

  • select all rows where event is this_event and file is this_file
  • and drop a row if it has the same timestamp as a row where file is my_file and the quantities match.

How do I do that? Really struggling. I don't know how to manage this.

EDIT:

Example data:

timestamp,  event,  quantity,   file
2018-10-17 02:01:00,    slept,  7,  base
2018-10-17 02:01:00,    slept,  7,  temp
2018-10-17 02:01:00,    slept,  9,  base
2018-10-17 02:04:00,    studied,    5,  temp
2018-10-17 02:04:00,    farted, 7,  temp
2018-10-17 02:04:00,    drank,  1,  base
2018-10-17 02:04:00,    exercised,  8,  base
2018-10-17 02:04:00,    slept,  7,  base

So, for example, I will always keep records from file base; this is a bias I want to keep, as those records can never be removed. I want to delete any record from any other file (here, temp) where the timestamp and event match a base record at the same timestamp, but only when the quantity also matches a base entry at that timestamp.

So in this example data I would expect the code to identify the 2nd entry down and remove it, because its quantity of 7 matches one of the two base entries at that timestamp.

The code would not delete anything at 02:04:00 because no rows there share both a timestamp and an event string (the events are all unique). The expected result is shown below.
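To make the rule concrete, this is the output I would expect for the example above (only the 2nd row, the temp duplicate of the base entry at 02:01:00, is removed):

timestamp,  event,  quantity,   file
2018-10-17 02:01:00,    slept,  7,  base
2018-10-17 02:01:00,    slept,  9,  base
2018-10-17 02:04:00,    studied,    5,  temp
2018-10-17 02:04:00,    farted, 7,  temp
2018-10-17 02:04:00,    drank,  1,  base
2018-10-17 02:04:00,    exercised,  8,  base
2018-10-17 02:04:00,    slept,  7,  base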

uncle-junky

1 Answer


I think there are many ways to solve it. Here is some sample code for your reference.

import pandas as pd
from io import StringIO  # pandas.compat.StringIO was removed in later pandas versions
from pprint import pprint

text = """
timestamp,event,quantity,file
2018-10-17 02:01:00,slept,7,base
2018-10-17 02:01:00,slept,7,temp
2018-10-17 02:01:00,slept,9,base
2018-10-17 02:04:00,studied,5,temp
2018-10-17 02:04:00,farted,7,temp
2018-10-17 02:04:00,drank,1,base
2018-10-17 02:04:00,exercised,8,base
2018-10-17 02:04:00,slept,7,base
"""

df = pd.read_csv(StringIO(text))
print("original dataframe is")
pprint(df)

qty = 7
my_file = 'temp'
my_event = 'slept'

# Group by timestamp so each check is evaluated within a single timestamp.
grp = df.groupby("timestamp")
qty_check = grp["quantity"].transform(lambda x: x == qty)
evt_check = grp["event"].transform(lambda x: x == my_event)
file_check = grp["file"].transform(lambda x: x == my_file)  # was grp["event"], a bug

combined_check = evt_check & qty_check & file_check
print("\ndropped rows are")
pprint(df[combined_check])

# To keep everything except the matches, invert the mask.
print("\nremaining dataframe is")
pprint(df[~combined_check])

Please refer to the pandas groupby/transform section as well: http://pandas.pydata.org/pandas-docs/stable/user_guide/groupby.html#transformation. Note: you may find the filter functions useful too, but as far as I know filter drops whole groups rather than individual rows within a group.
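To go beyond hardcoded values, here is a rough sketch of the general rule from the question (drop any non-base row whose timestamp, event, and quantity all match some base row), written as an anti-join with pandas merge and its indicator flag. The column and file names are taken from the example; adjust them to your real data.

# Sketch: anti-join against the base rows on (timestamp, event, quantity).
# Assumes `df` is the frame built above; nothing is hardcoded to 'slept' or 7.
base_keys = (df[df["file"] == "base"]
             [["timestamp", "event", "quantity"]]
             .drop_duplicates())
merged = df.merge(base_keys, on=["timestamp", "event", "quantity"],
                  how="left", indicator=True)
# Keep base rows unconditionally; keep other rows only if they matched no base row.
keep = (merged["file"] == "base") | (merged["_merge"] == "left_only")
result = df[keep.to_numpy()]
print(result)

On the example data this keeps every row except the temp duplicate at 02:01:00, matching the expected output in the question.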

Yong Wang
  • You're only filtering out lines with 'slept' and 'base'. This does not include the logic of what I described. The code would need `import datetime` in there to compare `timestamp` values, for example. Appreciate the effort though, just isn't what I'm looking for. – uncle-junky Apr 26 '19 at 03:44
  • Yes, the original one did not catch all the logic. I have updated it already. – Yong Wang Apr 26 '19 at 06:23
  • Is it possible to update the code with `groupby()` on `timestamp` and then perform the `combined_check`? – uncle-junky Apr 27 '19 at 05:34