I have a Pandas python dataframe that has one column that is just a list of tags, somewhat like is shown below.
Index | User_Details | Tags
------|--------------|-------
0 | A |[tag_a, tag_b]
1 | B |-
2 | C |[tag_a]
.... | ... |....
This list column has an unknown, varying number of tags and users can have none, one or many of them. They are separated by commas. What I am trying to do is turn it into a boolean table like that shown below:
Index | User_Details | tag_a | tag_a
------|--------------|-------|-------
0 | A |1 |1
1 | B |0 |0
2 | C |1 |0
.... | ... |.... |...
I found some things on here that did this when the tags were limited and all known. Usually there were like only 3 tags, but I'm looking at up to 30ish.
Any ideas?
Thanks
NOTE: This is different to How to one-hot-encode from a pandas column containing a list? as some of my tag rows contain no data. Using any of the methods applied there usually results in a failure along lines of: TypeError: object of type 'float' has no len()