-1

I have a column in a dataframe, and I want to filter the dataframe based on which rows of that column contain values that appear in another dataframe.

In other words, I have a blacklist of keywords that I want to make sure are not in a dataframe.

Tom Woods
  • 19
  • 2

1 Answer

0

To select rows whose column value is in an iterable, some_values, use isin:

df.loc[df['column_name'].isin(some_values)]
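As a minimal sketch of how this plays out, the snippet below builds a small made-up dataframe and applies isin both ways. The data and the contents of some_values are assumptions for illustration only; note that isin tests for exact matches of the whole cell value, not substrings.

```python
import pandas as pd

# Hypothetical data; the values here are illustrative assumptions.
df = pd.DataFrame({"column_name": ["alice", "bob", "carol", "dave"]})
some_values = {"bob", "dave"}

# Keep only the rows whose value is in some_values (exact match).
kept = df.loc[df["column_name"].isin(some_values)]

# Negate the boolean mask with ~ to drop those rows instead.
dropped = df.loc[~df["column_name"].isin(some_values)]

print(kept)
print(dropped)
```

The ~ form is the one that fits a blacklist, since it keeps everything that is not listed.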

You can convert your blacklist of keywords to a set and use the solution above, negating the mask with ~ to exclude the matching rows. A similar question that I refer to is here.

LEGION GREEN
  • 151
  • 6
  • Thank you for this, but I need a way to filter based on whether or not the values contain a blacklisted keyword, e.g. `df2 = df1.loc[~df1['userName'].str.contains(df[blacklist])]` – Tom Woods Nov 04 '22 at 02:31
  • I suppose what you need is `df2 = df1[~df1['userName'].isin(df[blacklist])]`, ref https://stackoverflow.com/questions/19960077/how-to-filter-pandas-dataframe-using-in-and-not-in-like-in-sql . – LEGION GREEN Nov 04 '22 at 07:25
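
For the substring case raised in the comments (dropping rows whose value contains any blacklisted keyword, rather than equals one), one possible approach is to join the keywords into a single regular expression and pass it to str.contains, which takes a pattern string rather than a Series. This is only a sketch under assumptions: the blacklist dataframe, its keyword column, and the sample data are hypothetical, and case-insensitive matching is a choice that may not suit every use.

```python
import re

import pandas as pd

# Hypothetical data; the column name "keyword" is an assumption.
df1 = pd.DataFrame({"userName": ["spam_bot42", "alice", "FreeCoins", "bob"]})
blacklist_df = pd.DataFrame({"keyword": ["spam", "coins"]})

# Escape each keyword so regex metacharacters are matched literally,
# then join with | so the pattern matches any one of them.
# Caveat: an empty blacklist yields an empty pattern, which matches every row.
keywords = blacklist_df["keyword"].dropna().astype(str)
pattern = "|".join(re.escape(k) for k in keywords)

# True where userName contains at least one keyword; missing names count as no match.
mask = df1["userName"].str.contains(pattern, case=False, regex=True, na=False)

# Keep only the rows that contain none of the blacklisted keywords.
df2 = df1.loc[~mask]
print(df2)
```

With very long blacklists the combined pattern can get slow, in which case checking the keywords in batches may be worth trying.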