If a row satisfies a condition, drop it; otherwise keep it

Question

import pandas as pd
import numpy as np

data = {'id': [1, 2, 3, 4, 5, 6],
        'created_at': ['2020-02-01', '2020-02-02', '2020-02-02', '2020-02-02', '2020-02-03', '2020-02-02'],
        'type': ['red', np.nan, np.nan, 'blue', 'yellow', np.nan]}

df = pd.DataFrame(data, columns=['id', 'created_at', 'type'])

If created_at is 2020-02-01 and id is 2 or 6, drop the row; otherwise keep it. I want to obtain this output:

id	created_at	type
1	2020-02-01	red
3	2020-02-01	NaN
4	2020-02-01	blue
5	2020-02-03	yellow

That is, I don't want to drop every row that has a NaN value.

if df['created_at'] == '2020-02-01' and df['id']== 2 or df['id'] == 6: — Piotr Żak, Nov 08 '22 at 10:08

mozway · Answer 1 · 2022-11-08T10:14:53.127


You would need to use boolean indexing:

# is the date 2020-02-01?
m1 = df['created_at'].eq('2020-02-01')
# is the id 2 or 6?
m2 = df['id'].isin([2, 6])

# keep if NOT both conditions are matched
out = df[~(m1&m2)]
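
To see what the mask does, here is a minimal, self-contained sketch on the sample frame from the question. Each comparison returns a boolean Series aligned with the frame's index, and `df[mask]` keeps the rows where the mask is True. Note that in the data as posted, no row has both created_at equal to '2020-02-01' and id 2 or 6, so nothing is dropped. The second half assumes the intended date was '2020-02-02'; that is only a guess based on the expected output listing ids 1, 3, 4 and 5.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5, 6],
    'created_at': ['2020-02-01', '2020-02-02', '2020-02-02',
                   '2020-02-02', '2020-02-03', '2020-02-02'],
    'type': ['red', np.nan, np.nan, 'blue', 'yellow', np.nan],
})

# Each comparison yields a boolean Series aligned with df's index.
m1 = df['created_at'].eq('2020-02-01')
m2 = df['id'].isin([2, 6])

# With the date exactly as stated, no row satisfies both conditions,
# so every row is kept.
out = df[~(m1 & m2)]
print(len(out))  # 6

# Assumption (not confirmed by the question): the intended date was '2020-02-02'.
m1_alt = df['created_at'].eq('2020-02-02')
out_alt = df[~(m1_alt & m2)]
print(out_alt['id'].tolist())  # [1, 3, 4, 5]
```

Row 3 survives even though its type is NaN, because the mask only looks at created_at and id.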

Alternatively:

# is the date NOT 2020-02-01?
m1 = df['created_at'].ne('2020-02-01')
# is the id NOT 2 or 6?
m2 = ~df['id'].isin([2, 6])

# keep if either condition is matched
out = df[m1|m2]
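
The two formulations are equivalent by De Morgan's law: NOT (A AND B) is the same as (NOT A) OR (NOT B). The sketch below checks this on a small hypothetical frame (the data is made up for illustration, not taken from the question) and also shows that `DataFrame.drop` on the matching index labels gives the same rows.

```python
import pandas as pd

# Hypothetical toy frame, only used to compare the two formulations.
df = pd.DataFrame({
    'id': [1, 2, 3, 6],
    'created_at': ['2020-02-01', '2020-02-01', '2020-02-02', '2020-02-01'],
})

is_date = df['created_at'].eq('2020-02-01')
is_id = df['id'].isin([2, 6])

keep_a = ~(is_date & is_id)   # NOT (A AND B)
keep_b = ~is_date | ~is_id    # (NOT A) OR (NOT B)

print(keep_a.equals(keep_b))       # True
print(df[keep_a]['id'].tolist())   # [1, 3]

# Same result by dropping the index labels of the matching rows.
dropped = df.drop(df.index[is_date & is_id])
print(dropped.equals(df[keep_a]))  # True
```

Which form to use is mostly a matter of readability; the negated single mask tends to stay closer to the wording "drop rows where both conditions hold".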

edited Nov 08 '22 at 10:14

answered Nov 08 '22 at 10:07


Does df[~(m1&m2)] select the common part? And what about out = df[m1|m2]? – Piotr Żak Nov 08 '22 at 10:09
How does that logic filter subsets of the variables? – Piotr Żak Nov 08 '22 at 10:09
@PiotrŻak I don't understand your question, do you mean how boolean indexing works? I added a link – mozway Nov 08 '22 at 10:14
Yes, this is the first time I have met this concept, and I want to go a bit further in understanding this type of data operation. – Piotr Żak Nov 08 '22 at 10:16
