
I've got a Pandas dataframe (v0.25.3, Python 3.6) and I'm trying to do operations on rows that match certain conditions. I've done this hundreds of times, but now I'm getting weird behavior that I can't figure out. Specifically, I've got two conditions, and I want to capture only rows where both conditions are True, but I'm getting rows in my results where either or both conditions are False.

For example,

print(data.loc[1,"var1"] != None)
print(data.loc[1,"var2"] != None)

returns False and True, but when I run

thisData1 = data.loc[((data["var1"] != None) & (data["var2"] != None))]
print(thisData1.head())

row 1 is still in there...all the data is still in there! If I use the older styling without .loc I get the same results. Row 0 is still in there even though both of its values are None. Furthermore, when I run just

print(len(data[data['var1'] != None]))

It again doesn't filter anything even though print(data.loc[1,"var1"] != None) => False

Everything here SEEMS to conform to the correct Pandas way to do this (e.g., see this question), and it usually works, but I can't see what I'm doing wrong in this case. Can anybody spot my error or recommend a different/safer way to run these filters? If the problem is my dataset, what should I check?

Aaron Bramson

1 Answer


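Use notnull() instead of != None. Pandas evaluates != None element by element and treats None as a missing value, so the comparison comes out True for every row and the mask filters nothing.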

thisData1 = data[data["var1"].notnull() & data["var2"].notnull()]
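
To see the difference on a small scale, here is a minimal sketch. The sample values are made up for illustration; only the column names var1 and var2 come from the question. The dropna call at the end is an alternative I'm adding, not something the question used.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data (values invented for illustration).
data = pd.DataFrame({
    "var1": ["a", None, "c", None],
    "var2": ["x", "y", None, None],
})

# Single cell: an ordinary Python comparison, so None != None is False.
scalar_check = data.loc[1, "var1"] != None
print(scalar_check)

# Whole column: the elementwise comparison is True everywhere,
# including the rows that hold None, so nothing gets filtered.
naive_mask = data["var1"] != None
print(naive_mask.tolist())

# Null-aware masks: keep only rows where both columns hold a value.
both_present = data["var1"].notnull() & data["var2"].notnull()
thisData1 = data[both_present]
print(thisData1)

# Alternative with the same effect: drop rows missing either column.
same_rows = data.dropna(subset=["var1", "var2"])
```

With this sample only row 0 survives the notnull filter, while the != None mask keeps all four rows. Note that notnull also treats NaN as missing, which is usually what you want when filtering.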
Lambda
  • Thank you. I got a work-around using `isinstance(x, str)` to work for my immediate issue, so I figured the problem was something about the `None`, but I didn't know the correct method in general...now I do. – Aaron Bramson May 22 '20 at 07:26