I have the following DataFrame:

import pandas as pd
stuff = [
    {"num": 4, "id": None},
    {"num": 3, "id": "stuff"},
    {"num": 6, "id": None},
    {"num": 8, "id": "other_stuff"},
]
df = pd.DataFrame(stuff)

I need to select rows where "num" is higher than a given number, but only if "id" is not None.

This doesn't have any effect:

df = df.loc[df["num"] >= 5 & ~pd.isnull(df["id"])]

What I need is something like this (pseudocode):

df = df.loc[
    if ~pd.isnull(df["id"]):
       if df["num"] >= 5:
          select row
]

The expected result:

>>> df
    id        num
 1  stuff       3
 2  None        6
 3  other_stuff 8

Any help appreciated.

Ivan Bilan

1 Answer

Add parentheses (because of operator precedence) and use | for bitwise OR instead of & for bitwise AND. Also, instead of inverting pd.isnull, you can use notna, or notnull for older pandas versions:

df = df[(df["num"] >= 5) | (df["id"].notna())]
print(df)
   num           id
1    3        stuff
2    6         None
3    8  other_stuff
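
To see why the parentheses matter, here is a minimal sketch (the names big_enough, has_id, either and both are only illustrative, not from the question). In Python, & and | bind more tightly than comparisons such as >=, so df["num"] >= 5 & mask is parsed as df["num"] >= (5 & mask). Building each mask separately sidesteps that and makes it easy to compare OR with AND:

```python
import pandas as pd

df = pd.DataFrame(
    [
        {"num": 4, "id": None},
        {"num": 3, "id": "stuff"},
        {"num": 6, "id": None},
        {"num": 8, "id": "other_stuff"},
    ]
)

# Build each condition on its own, so precedence cannot interfere.
big_enough = df["num"] >= 5   # True where num is at least 5
has_id = df["id"].notna()     # True where id is not None/NaN

# OR keeps a row if at least one condition holds.
either = df[big_enough | has_id]

# AND keeps a row only if both conditions hold.
both = df[big_enough & has_id]

print(either)
print(both)
```

With the sample data, either keeps rows 1, 2 and 3, which matches the expected result in the question, while both keeps only row 3.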
jezrael