
I saw good posts that perfectly answer my title question (including this one), but I am in a more specific situation.

Let's say I have the following very simple DataFrame

df.head()

   param  accuracy
0   None        98
1    4.0       100
2    5.0        95
3    6.0        87
4    7.0        56
5    8.0        45
6    9.0        59
7   None        96
...

I would like to restrict my DataFrame to data where param is either None or 4. I tried the following technique

params = [None, 4]
df = df[df['param'].isin(params)]

which only selects data where param is 4.

This post shows how to filter None values with the isnull() method, but that approach cannot be passed to isin(). Hence my question.

YannTC
  • I don't think you can use isin in this scenario, as NaN == NaN returns False (pandas stores a missing float as NaN). The isnull and isna methods should help in this case – sammywemmy Jun 08 '20 at 15:08
  • I think your problem is that the None values in the param column are stored as strings. If they were real None (NaN) values, your code would filter them out. If that is not the case, @alexander's answer is valid. – IMB Jun 08 '20 at 15:13
  • Thanks @IMB, it works if I use params=["None", 4]. I did not realize that my None values were strings. – YannTC Jun 08 '20 at 15:18
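
The two cases discussed in the comments (real missing values versus the string 'None') can be told apart by looking at the Python types stored in the column. This is a minimal sketch on made-up sample data, not the asker's actual frame; the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical sample columns: one holds real missing values (NaN),
# the other holds the string 'None' in the same positions.
real_missing = pd.Series([np.nan, 4.0, 5.0, np.nan], name="param")
string_none = pd.Series(["None", 4.0, 5.0, "None"], dtype=object, name="param")

# isna() detects real missing values, but not the string 'None'.
n_real = int(real_missing.isna().sum())
n_string = int(string_none.isna().sum())

# Counting the Python type of each element exposes the stray strings.
type_counts = string_none.map(lambda v: type(v).__name__).value_counts()

print(n_real, n_string)
print(type_counts)
```

If str shows up in the type counts of a column that should be numeric, the "missing" entries are probably strings and isna() will not find them.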

3 Answers


You can combine selectors (boolean masks) with the element-wise "and" (&) and "or" (|) operators to construct new ones. Would this help in your case?

params = [4]
df = df[df['param'].isin(params) | df['param'].isnull()]
Alexander Pivovarov
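
For a params list that mixes None with ordinary values, the same idea can be wrapped in a small helper. This is only a sketch: isin_or_missing is a hypothetical name, not a pandas function, and it assumes the column holds real missing values (NaN) and that params contains scalars only.

```python
import numpy as np
import pandas as pd

def isin_or_missing(series, values):
    """Hypothetical helper: True where `series` is in `values`; a None/NaN
    entry in `values` additionally matches missing entries of `series`."""
    wanted = [v for v in values if not pd.isna(v)]
    mask = series.isin(wanted)
    if len(wanted) < len(values):  # the caller asked for missing values too
        mask = mask | series.isna()
    return mask

# Sample data modelled on the question, with real NaN instead of None.
df = pd.DataFrame({
    "param": [np.nan, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, np.nan],
    "accuracy": [98, 100, 95, 87, 56, 45, 59, 96],
})

filtered = df[isin_or_missing(df["param"], [None, 4])]
print(filtered)
```

Splitting the list this way keeps the check for missing values explicit instead of relying on how isin() treats None.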

As pointed out by @IMB, a solution is to do params = ["None", 4] instead of params = [None, 4].

My DataFrame initially contained NaN, which I replaced with the string 'None' using df = df.fillna('None'). Hence the string type.

YannTC
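
The effect of fillna('None') described above can be reproduced on a small frame. This is a sketch with made-up data; it only illustrates that the filled values are strings, so they match "None" in isin() and are no longer seen by isna().

```python
import numpy as np
import pandas as pd

# Made-up sample frame with real missing values in 'param'.
df = pd.DataFrame({
    "param": [np.nan, 4.0, 5.0, np.nan],
    "accuracy": [98, 100, 95, 96],
})

# As in the answer above: NaN is replaced by the *string* 'None'.
df_str = df.fillna("None")

# The string now matches in isin() ...
kept = df_str[df_str["param"].isin(["None", 4])]

# ... while isna() no longer finds anything in the filled column.
missing_after_fill = int(df_str["param"].isna().sum())

print(kept)
print(missing_after_fill)
```

Keeping the original NaN values and combining isin() with isna() avoids turning a numeric column into one that mixes floats and strings.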

Try this:

df = df[(df['param'] == 4) | df['param'].isna()]