
I am trying to build a boolean Series that combines multiple conditions in Pandas (i.e. use a boolean operator to filter the data), based on the question/answer here, but I need to use bracket column notation. (Python 3.7)

This works, and returns [index, Boolean]:

mySeries = data['myCol'] == 'A'

These both return errors:

mySeries = (data['rank'] == 'A' or data['rank'] == 'B')
mySeries = (data['rank'] == 'A' | data['rank'] == 'B')

The first one returns ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). The answers in this question seem to address this error for a dataframe, not a series. The second attempt returns this error: TypeError: Cannot perform 'ror_' with a dtyped [object] array and scalar of type [bool]

I am using bracket notation df['rank'] instead of dot notation df.rank because in the dot notation, Pandas confuses the column name with the rank method.

a11

2 Answers

We can just use isin:

mySeries = data['rank'].isin(['A', 'B'])
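
For illustration, here is a minimal sketch of how the isin mask might be used. The sample frame and its score column are made up for the example, since the question does not show the real data:

```python
import pandas as pd

# Hypothetical sample data; the real frame is not shown in the question.
data = pd.DataFrame({"rank": ["A", "B", "C", "A"], "score": [1, 2, 3, 4]})

# isin returns a boolean Series aligned with data's index:
# True where the value is one of the listed labels.
mySeries = data["rank"].isin(["A", "B"])

# The mask can then be used to select the matching rows.
filtered = data[mySeries]
```

This scales better than chaining comparisons when the list of accepted values grows.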
BENY

Based on the answer by @unutbu here, this is the correct notation. The issue was that each condition needed to be in its own parentheses:

mySeries = (data['rank'] == 'A') | (data['rank'] == 'B')
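
The parentheses matter because | binds more tightly than == in Python, so without them 'A' | data['rank'] is evaluated first. A rough sketch of both cases, using a made-up frame (the exact exception type and message may differ between pandas versions, so the sketch only records that one was raised):

```python
import pandas as pd

# Hypothetical sample data; the real frame is not shown in the question.
data = pd.DataFrame({"rank": ["A", "B", "C", "A"]})

# Each comparison is parenthesized, so | combines two boolean Series.
either = (data["rank"] == "A") | (data["rank"] == "B")

# The same mask written with isin, for comparison.
same_as_isin = either.equals(data["rank"].isin(["A", "B"]))

# Without parentheses this parses as
#   data["rank"] == ("A" | data["rank"]) == "B"
# which fails before any comparison result can be combined.
try:
    data["rank"] == "A" | data["rank"] == "B"
    failed = False
except Exception as exc:
    failed = True
    error_name = type(exc).__name__
```

The plain or keyword fails for a different reason: it asks Python for a single truth value of a whole Series, which is what the "truth value of a Series is ambiguous" message refers to.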
a11