I have a dataframe with a column whose values are strings of comma-separated items.
col1
apple, banana, kiwi
apple, banana
banana
I want to add a second column, 'col2', that shows which items were in the previous row but are missing from the current row.
So I'm trying to turn each row into a set and subtract it from the previous row's set, as described here: Python comparing two strings to differences
df['col2'] = set(df["col1"].shift(1)) - set(df["col1"])
However, I get this error message: "ValueError: Length of values does not match length of index". What am I doing wrong, and is there a better way to do this?
EDIT: expected output
col1                   col2
apple, banana, kiwi
apple, banana          kiwi
banana                 apple
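
For reference, here is a minimal sketch of one row-wise way to get this output. It builds one set per row, whereas set(df["col1"]) builds a single set from the whole column. The helper name removed_items and the choice of an empty string for the first row are assumptions made for illustration, not anything pandas prescribes.

```python
import pandas as pd

df = pd.DataFrame({"col1": ["apple, banana, kiwi", "apple, banana", "banana"]})

# One set of stripped items per row (not one set for the whole column).
row_sets = [{item.strip() for item in text.split(",")} for text in df["col1"]]

# Align each row with the row before it; the first row has no predecessor.
prev_sets = [None] + row_sets[:-1]

def removed_items(prev, cur):
    # Hypothetical helper: items in the previous row that are absent from the current row.
    if prev is None:
        return ""  # assumption: leave the first row empty
    return ", ".join(sorted(prev - cur))

# The list has exactly one entry per row, so its length matches the index.
df["col2"] = [removed_items(p, c) for p, c in zip(prev_sets, row_sets)]
print(df)
```

Sorting the difference only makes the order deterministic when more than one item disappears between rows, since sets are unordered.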