I have a DataFrame in which each cell holds a comma-separated string of values. I cannot figure out how to one-hot encode the values within each column.
In:
import pandas as pd

lst = [['Red, Blue, Yellow', 'Blue, Green, Yellow', 'Green, Red, Blue'],
       ['Yellow, Red, Blue', 'Blue, Red, Green', 'Yellow, Blue, Red'],
       ['Yellow, Red, Green', 'Red, Yellow, Blue', 'Green, Blue, Red']]
# The cells are strings, so no float dtype; the index matches the names shown in the output
df = pd.DataFrame(lst, columns=['A', 'B', 'C'], index=['Ella', 'Mike', 'Dave'])
Out:
      A                   B                    C
Ella  Red, Blue, Yellow   Blue, Green, Yellow  Green, Red, Blue
Mike  Yellow, Red, Blue   Blue, Red, Green     Yellow, Blue, Red
Dave  Yellow, Red, Green  Red, Yellow, Blue    Green, Blue, Red
I would like to one-hot encode it with multi-level column headings, so that the result looks like this:
      A                         B                         C
      Red  Blue  Green  Yellow  Red  Blue  Green  Yellow  ....
Ella  1    1     0      1       0    1     1      1       ....
Mike  1    1     0      1       1    1     1      0       ....
Dave  1    0     1      1       1    1     0      1       ....
I would be so grateful for some guidance as I've been stuck on this for a while!
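For reference, here is a minimal sketch of one possible way to get that layout, not necessarily the best one. It assumes every cell is a string separated by exactly `', '`, that the index is `Ella`, `Mike`, `Dave` as in the output above, and that the colour order is fixed by a hypothetical `colours` list. It uses `Series.str.get_dummies` on each column and `pd.concat` with a dict so that the original column names become the top level of the column headings.

```python
import pandas as pd

lst = [['Red, Blue, Yellow', 'Blue, Green, Yellow', 'Green, Red, Blue'],
       ['Yellow, Red, Blue', 'Blue, Red, Green', 'Yellow, Blue, Red'],
       ['Yellow, Red, Green', 'Red, Yellow, Blue', 'Green, Blue, Red']]
df = pd.DataFrame(lst, columns=['A', 'B', 'C'], index=['Ella', 'Mike', 'Dave'])

# Assumed display order for the second header level (hypothetical name)
colours = ['Red', 'Blue', 'Green', 'Yellow']

# One-hot encode each column separately, then stack the results side by side.
# Passing a dict to pd.concat makes its keys the top level of a column MultiIndex.
encoded = pd.concat(
    {
        col: df[col].str.get_dummies(sep=', ').reindex(columns=colours, fill_value=0)
        for col in df.columns
    },
    axis=1,
)

print(encoded)
```

The `reindex` step only fixes the column order and fills in any colour that never appears in a given column with 0; it can be dropped if alphabetical order is acceptable.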