I have a dataframe df with a column "A". How do I select a subset of df based on multiple conditions? I am trying:

train.loc[(train["A"] != 2) or (train["A"] != 10)]

The or operator doesn't seem to be working. How can I fix this? I got the error:

ValueError                                Traceback (most recent call last)
<ipython-input-30-e949fa2bb478> in <module>
----> 1 sub_train.loc[(sub_train["primary_use"] != 2) or (sub_train["primary_use"] != 10), "year_built"]

/opt/conda/lib/python3.6/site-packages/pandas/core/generic.py in __nonzero__(self)
   1553             "The truth value of a {0} is ambiguous. "
   1554             "Use a.empty, a.bool(), a.item(), a.any() or a.all().".format(
-> 1555                 self.__class__.__name__
   1556             )
   1557         )

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
NoLand'sMan

2 Answers

Use | for bitwise OR or & for bitwise AND. Also, loc is not necessary here:

# keep rows where A is 2 or 10
train[(train["A"] == 2) | (train["A"] == 10)]
# keep rows where A is neither 2 nor 10
train[(train["A"] != 2) & (train["A"] != 10)]

If you also want to select some columns, then loc is necessary:

train.loc[(train["A"] == 2) | (train["A"] == 10), 'B']
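As a minimal, self-contained sketch of the above (the sample frame and the contents of column "B" are made up for illustration), the following shows that Python's or raises the same ValueError as in the question, while | and & combine the masks row by row:

```python
import pandas as pd

# Hypothetical sample data; only the column name "A" comes from the question.
train = pd.DataFrame({"A": [1, 2, 3, 10], "B": ["w", "x", "y", "z"]})

# Python's `or` asks for the truth value of a whole Series, which pandas refuses.
try:
    train[(train["A"] == 2) or (train["A"] == 10)]
    or_error = None
except ValueError as exc:
    or_error = str(exc)

# Element-wise operators build a boolean mask row by row.
either = train[(train["A"] == 2) | (train["A"] == 10)]   # rows where A is 2 or 10
neither = train[(train["A"] != 2) & (train["A"] != 10)]  # rows where A is neither

# With .loc the same mask can be paired with a column label.
b_values = train.loc[(train["A"] == 2) | (train["A"] == 10), "B"]

print(or_error)
print(either["A"].tolist(), neither["A"].tolist(), b_values.tolist())
```

The parentheses around each comparison matter, because | and & bind more tightly than == and != in Python.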

jezrael

You need | instead of or to combine boolean Series:

train.loc[(train["A"] != 2) | (train["A"] != 10)]

To avoid worrying about parentheses, use Series.ne. In principle, loc is not necessary here unless you want to select a specific column:

train[train["A"].ne(2) | train["A"].ne(10)]
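To illustrate why the parentheses matter, here is a small sketch on made-up data. It assumes an integer column "A"; the variable names are only for illustration:

```python
import pandas as pd

train = pd.DataFrame({"A": [1, 2, 3, 10]})  # hypothetical sample data

# `|` binds tighter than `!=`, so without parentheses this is parsed as
# train["A"] != (2 | train["A"]) != 10, a chained comparison that needs
# the truth value of a Series and therefore fails.
try:
    train[train["A"] != 2 | train["A"] != 10]
    unparenthesized_ok = True
except ValueError:
    unparenthesized_ok = False

# Method calls have no such precedence problem.
with_ne = train["A"].ne(2) | train["A"].ne(10)
with_parens = (train["A"] != 2) | (train["A"] != 10)

print(unparenthesized_ok, with_ne.equals(with_parens))
```

Both spellings produce the same mask; Series.ne simply removes the need to remember the operator precedence.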

But I think your logic is wrong, since this mask does not filter anything: a value of 2 is kept because it is different from 10, and vice versa. I think you want Series.isin with ~:

train[~train["A"].isin([2,10])]

or use &:

train[train["A"].ne(2) & train["A"].ne(10)]
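A quick way to check the reasoning above is to compare the three masks on a small, made-up frame (the values below are hypothetical and chosen to include both 2 and 10):

```python
import pandas as pd

train = pd.DataFrame({"A": [1, 2, 3, 10]})  # hypothetical sample data

or_mask = train["A"].ne(2) | train["A"].ne(10)   # every value differs from 2 or from 10
and_mask = train["A"].ne(2) & train["A"].ne(10)  # False where A is 2 or 10
isin_mask = ~train["A"].isin([2, 10])            # same rows as and_mask

print(or_mask.tolist())
print(and_mask.tolist())
print(and_mask.equals(isin_mask))
print(train[isin_mask]["A"].tolist())
```

The | mask is True for every row, so it filters nothing, while the & and ~isin masks agree and drop the rows where A is 2 or 10.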
ansev