
I have a DataFrame that I would like to filter by multiple values within a single column. How can I accomplish this? When I filter by a single value I usually use df_filtered = df[df['column'] == value], but that isn't working for the 61 values, at least as I've tried it. Thank you.

     MRN  ... Result
0  13556832  ...  400.0
1  13556832  ...  400.0
2  13556832  ...  400.0
3  13556832  ...  392.0
4  13556832  ...  400.0

Above is a sample of the DataFrame. There are about 100k rows, and I need to filter for the 61 MRN values that I have identified for a project. Ultimately, I would like a separate df that contains the rows for all the MRN values I have identified as important.

I am essentially looking for a solution similar to the .isin() operator, except for 61 values rather than a maximum of 2.

medlearning

2 Answers


Put all 61 MRNs into a list:

mrnList = [val1, val2, ..., val61]

Then filter on these MRNs:

df_filtered = df[df['MRN'].isin(mrnList)]

Make sure the values in mrnList have the same datatype as the MRN column (for example, integers rather than strings); otherwise isin may not match them.
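
A minimal sketch of the datatype point, using a small made-up frame (only the column names `MRN` and `Result` come from the question; the other MRNs, the numbers, and the string-list scenario are assumptions for illustration). If the column holds integers and the list holds strings, the filter comes back empty, so casting the list to the column's type first avoids that:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "MRN": [13556832, 13556832, 20000001, 20000002, 20000003],
    "Result": [400.0, 392.0, 410.0, 405.0, 398.0],
})

# Assumed scenario: the MRNs of interest were read in as strings.
mrn_strings = ["13556832", "20000002"]

# Integer column compared against a list of strings: no rows match.
no_match = df[df["MRN"].isin(mrn_strings)]

# Cast the list to the column's type (int here), then filter.
mrn_list = [int(m) for m in mrn_strings]
df_filtered = df[df["MRN"].isin(mrn_list)]
```

The same list can hold 61 values or more; `isin` does not limit the number of values it accepts.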

atinjanki

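You can use numpy.where for a single condition. Note that it returns a boolean array (a mask) rather than a DataFrame, so index df with the result to get the matching rows:

import numpy as np
mask = np.where(df['column'] == value, True, False)  # boolean NumPy array
df_filtered = df[mask]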

and logical_or, logical_and for multiple conditions:

import numpy as np
cond1 = df['column'] == value
cond2 = df['column'] == value2
mask = np.where(np.logical_or(cond1, cond2), True, False)  # boolean NumPy array
df_filtered = df[mask]
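
A minimal runnable sketch of the two-condition case, using a made-up frame and made-up values (the data, `value`, `value2`, and the `other` column are assumptions for illustration). It also shows pandas' `|` operator, which should select the same rows as `np.logical_or`:

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"column": [1, 2, 3, 2, 5], "other": list("abcde")})
value, value2 = 2, 5

cond1 = df["column"] == value
cond2 = df["column"] == value2

# np.where(..., True, False) yields a boolean NumPy array, not a DataFrame.
mask = np.where(np.logical_or(cond1, cond2), True, False)

# Index the frame with the mask to get the matching rows.
df_filtered = df[mask]

# The same selection using pandas' element-wise OR on the two conditions.
df_filtered_alt = df[cond1 | cond2]
```

Chaining conditions like this gets unwieldy beyond a handful of values, which is why a list-based check is the better fit for 61 of them.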

For filtering by a list of values, isin comes in handy:

whitelist = []  # fill in the values to keep
mask = np.where(np.isin(df['value'], whitelist), True, False)
df_filtered = df[mask]

To filter the DataFrame itself, the pandas isin method can be used directly:

df_filtered = df[df.value.isin(whitelist)]
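
As a rough end-to-end check, the sketch below builds a made-up frame and a made-up `whitelist` of 61 values (both are assumptions, not data from the question) and filters it two ways: with the pandas `isin` method and with a mask from `np.isin`. Both should return the same rows:

```python
import numpy as np
import pandas as pd

# Hypothetical data: 1,000 rows with 'value' drawn from 0..199.
rng = np.random.default_rng(0)
df = pd.DataFrame({"value": rng.integers(0, 200, size=1000)})

# Hypothetical whitelist of 61 values.
whitelist = list(range(61))

# pandas method, used directly as a row filter.
df_filtered = df[df["value"].isin(whitelist)]

# NumPy route: build the boolean mask first, then index with it.
mask = np.isin(df["value"], whitelist)
df_filtered_np = df[mask]
```

The pandas form is usually the shorter of the two when the goal is simply a filtered DataFrame.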
ldz
  • Just noticed: `df_filtered = np.where(np.isin(df['value'], whitelist)), True, False)` should be `df_filtered = np.where(np.isin(df['value'], whitelist), True, False)`. There was an extra `)` – Joe Flack Aug 12 '22 at 19:18