This question is conceptually similar to "Python Pandas: How to create a binary matrix from column of lists?", but because of the size of my data I do not want to convert it into a pandas DataFrame.
I have a list of lists like the following:
list_ = [[5, 3, 5, 2], [6, 3, 2, 1, 3], [5, 3, 2, 5, 2]]
I would like a binary matrix with one column per unique value and one row per sublist, where an entry is 1 if that value appears in the sublist and 0 otherwise.
How could this be done efficiently for more than 100,000 sublists with around 1,000 items each?
Edit:
The expected output is similar to the output in the question linked above. For example, given
list_ = [["a", "b"], ["c"], ["d"], ["e"]]
the result would be:
   a  b  c  d  e
0  1  1  0  0  0
1  0  1  0  0  0
2  0  0  1  0  0
3  0  0  0  1  0
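To pin down the mapping for the numeric list at the top, here is a minimal pure-Python sketch. It is only meant to show the intended result, not to be the efficient solution being asked for. The helper name `binary_matrix` is hypothetical, and two choices are assumptions on my part: columns are ordered by sorted value, and a value repeated within a sublist still yields a single 1.

```python
def binary_matrix(list_of_lists):
    """Return (columns, rows): sorted unique values and one 0/1 row per sublist."""
    # Assumption: columns are the distinct values across all sublists, sorted.
    columns = sorted({item for sub in list_of_lists for item in sub})
    col_index = {value: j for j, value in enumerate(columns)}

    rows = []
    for sub in list_of_lists:
        row = [0] * len(columns)
        for item in sub:
            # Assumption: repeated values in a sublist collapse to a single 1.
            row[col_index[item]] = 1
        rows.append(row)
    return columns, rows


list_ = [[5, 3, 5, 2], [6, 3, 2, 1, 3], [5, 3, 2, 5, 2]]
columns, matrix = binary_matrix(list_)

print(columns)  # [1, 2, 3, 5, 6]
for row in matrix:
    print(row)
# [0, 1, 1, 1, 0]
# [1, 1, 1, 0, 1]
# [0, 1, 1, 1, 0]
```

A dense list of lists like this would need on the order of rows times unique values cells, so at the sizes mentioned above a sparse representation is probably what I am really after.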