I am trying to clean up the data in one dataframe using the values in a column of another dataframe. The first dataframe contains semicolon-separated lists of values; the second contains single words. After cleaning, the first dataframe must not contain any words from the second.
data df1               data df2
x1;x2;x3               x1
key2;key6;key7;key8    x2
                       key6
                       key8
I need to remove from data df1 the values present in data df2. I am trying to convert the two columns from the different dataframes into two lists, and then remove from list1 (from df1) the values present in list2 (from df2).
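For illustration, the list-based approach described above might look like the following minimal sketch. The names `list1`, `list2`, `blocked`, and `cleaned` are assumptions made for this example, as is the choice to turn list2 into a set for lookups; with pandas, the two lists would typically come from calling `.tolist()` on the respective columns.

```python
# Hypothetical sample data mirroring the example above.
# list1: semicolon-separated rows from data df1
# list2: single words from data df2
list1 = ["x1;x2;x3", "key2;key6;key7;key8"]
list2 = ["x1", "x2", "key6", "key8"]

# Assumption: a set is used so each membership test is O(1) on average,
# which matters when list2 has over a million entries.
blocked = set(list2)

# For every row, split on ';', keep only the words that are not blocked,
# and join the survivors back into a semicolon-separated string.
cleaned = [
    ";".join(word for word in row.split(";") if word not in blocked)
    for row in list1
]

print(cleaned)
```

This still iterates over the rows of list1, but the cost per word no longer depends on the length of list2.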
Is there a faster way of doing this without a loop, considering that the data df2 column may have over 1M rows and each row of the data df1 column can hold more than one value?