
I want to filter a NumPy array by multiple conditions. I found this thread and tested the boolean-mask slicing method on my dataset, but I get unexpected results. At least they're unexpected to me; I might simply be misunderstanding how the bitwise operators work.

Just so you get an idea of the data:

test.shape
>>(222988, 2)

stats.describe(all_output[:, 0])
>>DescribeResult(nobs=222988, minmax=(2.594e-05, 74.821), mean=11.106, variance=108.246, [...])

stats.describe(all_output[:, 1])
>>DescribeResult(nobs=222988, minmax=(0.001, 8.999), mean=3.484, variance=7.606, [...])

Now, doing some basic filtering:

test1 = test[(test[:, 0] >= 30) & (test[:, 1] <= 2)] 

test1.shape
>>(337, 2)

Those are actually the rows I don't want to have in my dataset, so if I do what I believe is the opposite...

test2 = test[(test[:, 0] <= 30) & (test[:, 1] >= 2)] 

test2.shape
>>(112349, 2)

I would expect the result to be (222651, 2). I guess that I'm doing some embarrassingly simple thing wrong? Can anyone here push me in the right direction?

Thanks already! -M

maawoo

1 Answer


De Morgan's law: not (p and q) == (not p) *or* (not q). Flipping both comparisons while keeping & does not negate the original condition. The element-wise NOT operator in NumPy is ~, so

 ~((test[:, 0] >= 30) & (test[:, 1] <= 2)) == ((test[:, 0] < 30) | (test[:, 1] > 2))

Either will do what you want, e.g.

test1 = test[~((test[:, 0] >= 30) & (test[:, 1] <= 2))]
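
To check this on something small, here is a minimal sketch. It assumes a synthetic stand-in for `test` (random values, not the asker's data), and the mask names `unwanted`, `keep_not`, `keep_or` and `keep_and` are made up for illustration.

```python
import numpy as np

# Synthetic stand-in for the asker's `test` array (made-up values).
rng = np.random.default_rng(0)
test = np.column_stack([
    rng.uniform(0.0, 75.0, size=1000),  # column 0
    rng.uniform(0.0, 9.0, size=1000),   # column 1
])

# Rows to drop: both conditions hold.
unwanted = (test[:, 0] >= 30) & (test[:, 1] <= 2)

# Two equivalent ways to keep everything else (De Morgan's law).
keep_not = ~unwanted
keep_or = (test[:, 0] < 30) | (test[:, 1] > 2)

# The attempt from the question: comparisons flipped, but still `&`.
keep_and = (test[:, 0] <= 30) & (test[:, 1] >= 2)

filtered = test[keep_not]

print(np.array_equal(keep_not, keep_or))      # the two masks agree
print(unwanted.sum() + keep_not.sum())        # adds up to the row count
print(keep_and.sum(), "vs", keep_not.sum())   # the `&` version keeps fewer rows
```

The `&` version also throws away rows where only one of the two original conditions holds (for example column 0 >= 30 but column 1 > 2), which would explain why the count in the question came out far below the expected 222651.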
FHTMitchell
  • Thank you! Now I feel really stupid, because I even used ~ in my code before. But thanks for giving me some context about De Morgan's law! – maawoo Jun 14 '18 at 08:30