How do I concatenate two columns containing lists so that the resulting list has no duplicates?
df:
A B
[a,b] [c,d,a,b]
[s,d] [d,f]
Expected Result in new column:
A_B
[a,b,c,d]
[s,d,f]
df.sum(1).map(set).map(list).to_frame('_'.join(df))
A_B
0 [a, d, b, c]
1 [s, d, f]
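But this is probably better, since it builds each set directly from the row's lists instead of concatenating them first: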
pd.DataFrame(
{'_'.join(df): [[*set().union(*t)] for t in zip(*map(df.get, df))]},
df.index,
)
A_B
0 [a, d, b, c]
1 [s, d, f]
import pandas as pd  # setup used for the examples above
df = pd.DataFrame(dict(A=[[*'ab'], [*'sd']], B=[[*'cdab'], [*'df']]))
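Both approaches above go through `set`, which does not guarantee element order, so the result can come out as `[a, d, b, c]` rather than the `[a, b, c, d]` shown in the question. If first-seen order matters, one possible sketch is to deduplicate with `dict.fromkeys` instead. This is an alternative to the answer's method, not a replacement for it, and it assumes the list elements are hashable:

```python
import pandas as pd
from itertools import chain

df = pd.DataFrame(dict(A=[[*'ab'], [*'sd']], B=[[*'cdab'], [*'df']]))

# dict.fromkeys keeps only the first occurrence of each key,
# and dicts preserve insertion order (Python 3.7+).
df['A_B'] = [
    list(dict.fromkeys(chain(a, b)))
    for a, b in zip(df['A'], df['B'])
]
```

Here `A_B` is added to the existing frame rather than returned as a new one-column frame; wrap the list in `pd.DataFrame({'A_B': ...}, df.index)` if a separate frame is preferred.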