How can I compare lists held in two columns of a dataframe, identify whether the elements of one list are in the other list, and create another column with the missing elements?
The dataframe looks something like this:
import pandas as pd

df = pd.DataFrame({'A': ['a1', 'a2', 'a3'],
                   'B': [['b1', 'b2'], ['b1', 'b2', 'b3'], ['b2']],
                   'C': [['c1', 'b1'], ['b3'], ['b2', 'b1']],
                   'D': ['d1', 'd2', 'd3']})
I want to check, row by row, whether the elements of column C are in column B and write the missing values to a new column E. In the output below, E holds the elements of B that do not appear in C. The desired output is:
df = pd.DataFrame({'A': ['a1', 'a2', 'a3'],
                   'B': [['b1', 'b2'], ['b1', 'b2', 'b3'], ['b2']],
                   'C': [['c1', 'b1'], ['b3'], ['b2', 'b1']],
                   'D': ['d1', 'd2', 'd3'],
                   'E': ['b2', ['b1', 'b2'], '']})
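
For reference, here is a minimal sketch of one way the comparison might be done. It assumes that "missing" means the elements of B that are absent from C in the same row, which is what the desired output suggests. The helper name missing_from is made up for illustration and is not a pandas function. Unlike the desired output, this sketch always stores a list in E (['b2'] rather than 'b2', and [] rather than ''), so the column keeps a single type.

```python
import pandas as pd

df = pd.DataFrame({'A': ['a1', 'a2', 'a3'],
                   'B': [['b1', 'b2'], ['b1', 'b2', 'b3'], ['b2']],
                   'C': [['c1', 'b1'], ['b3'], ['b2', 'b1']],
                   'D': ['d1', 'd2', 'd3']})


def missing_from(b_list, c_list):
    """Hypothetical helper: elements of b_list not found in c_list, in b_list order."""
    c_set = set(c_list)  # set lookup avoids rescanning c_list for every element
    return [item for item in b_list if item not in c_set]


# Pair up the two list columns row by row and keep the original index.
df['E'] = pd.Series(
    [missing_from(b, c) for b, c in zip(df['B'], df['C'])],
    index=df.index,
)

print(df['E'].tolist())
```

If the exact mixed format shown above is needed (a bare string for one element, '' for none), the lists could be collapsed in a second pass, though a column of uniform lists is usually easier to work with afterwards.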