I have a Pandas dataframe (tempDF) of 5 columns by N rows. Each element of the dataframe is an object (string in this case). For example, the dataframe looks like (this is fake data - not real world):
I have two tuples, each contains a collection of numbers as a string type. For example:
codeset = ('6108','532','98120')
additionalClinicalCodes = ('131','1','120','130')
I want to retrieve a subset of the rows from the tempDF in which the columns "medcode" OR "enttype" have at least one entry in the tuples above. Thus, from the example above, I would retrieve a subset containing rows with the index 8 and 9 and 11.
Until updating some packages earlier today (too many now to work out which has started throwing the warning), this did work:
tempDF = tempDF[tempDF["medcode"].isin(codeSet) | tempDF["enttype"].isin(additionalClinicalCodes)]
But now it is throwing the warning:
FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
mask |= (ar1 == a)
Looking at the API, isin states the the condition "if ALL" is in the iterable collection. I want an "if ANY" condition.
UPDATE #1
The problem lies with using the |
operator, also the np.logical_or
method. If I remove the second isin
condition i.e., just keep tempDF[tempDF["medcode"].isin(codeSet)
then no warning is thrown but I'm only subsetting on the one possible condition.