
I have a dataframe that contains a column of lists.

For example, some entries may be:

A,B,C
A,C,D
D,A
C
G,E,F,D,C,A

I would like to one-hot encode this column, with one 0/1 column per distinct item, so that I end up with:

A B C D E F G
1 1 1 0 0 0 0
1 0 1 1 0 0 0
1 0 0 1 0 0 0
0 0 1 0 0 0 0
1 0 1 1 1 1 1

Is there a clean way to do this? Currently I loop through each row, and then through each item in that row's list, to encode it. I then combine the duplicate columns by summing them. There has to be a better way.
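
For reference, here is a minimal sketch of one way to do this without an explicit Python loop, assuming pandas and assuming the column holds actual Python lists. The column name `items` and the variable names are made up for illustration. It explodes each list into one row per item, builds indicator columns, and then collapses back to one row per original entry:

```python
import pandas as pd

# Example data from the question; the column name "items" is hypothetical.
df = pd.DataFrame({"items": [
    ["A", "B", "C"],
    ["A", "C", "D"],
    ["D", "A"],
    ["C"],
    ["G", "E", "F", "D", "C", "A"],
]})

# One row per (original row, item); the original index is repeated.
exploded = df["items"].explode()

# Indicator columns per item, then collapse back to one row per original
# index. Taking the max (rather than the sum) keeps the result 0/1 even
# if a list contains the same item twice.
encoded = pd.get_dummies(exploded).groupby(level=0).max().astype(int)

print(encoded)
```

If the column holds comma-separated strings rather than lists, `df["items"].str.get_dummies(sep=",")` may be enough on its own.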

Cicero