I'm wondering if there's a better way to split a string column into multiple rows by 'comma' and send the column from long to wide. My current approach is only working to some extent.
data = {'id':["ab3e3", "psdds2", "pas13", "ccdf2", "dsda1"],
'fruit':["apple, organge", "others", "dragon fruit, organge", "watermelon", "others"]}
df = pd.DataFrame(data)
lst_col = 'fruit'
x = df.assign(**{lst_col:df[lst_col].str.split(',')})
dt = x.explode('fruit')
dt['counts'] = 1
dt.pivot(index='id', columns='fruit', values = 'counts')
id fruit
0 ab3e3 apple, organge
1 psdds2 others
2 pas13 dragon fruit, organge
3 ccdf2 watermelon
4 dsda1 others
Expected output:
id organge apple dragon fruit others watermelon
ab3e3 1 1 0 0 0
ccdf2 0 0 0 0 1
dsda1 0 0 0 1 0
pas13 1 0 1 0 0
psdds2 0 0 0 1 0