I want to convert something like this:
['dog', 'cat', 'fish', 'dog', 'dog', 'bird', 'cat', 'bird']
into a boolean matrix with one column per classification. For this example, it would look like this:
(dog) (cat) (fish) (bird)
1 0 0 0
0 1 0 0
0 0 1 0
1 0 0 0
1 0 0 0
0 0 0 1
0 1 0 0
0 0 0 1
where each row has a 1 (true) in the column that matches its classification and 0 everywhere else. I know I could do this iteratively, roughly like this:
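import numpy as np

# "class" is a reserved word in Python, so the input list is called labels here
labels = ['dog', 'cat', 'fish', 'dog', 'dog', 'bird', 'cat', 'bird']
n_classes = 4  # dog, cat, fish, bird
# one row per input item, one column per classification
new = np.zeros((len(labels), n_classes), dtype=int)
for i, c in enumerate(labels):
    if c == 'dog':
        new[i][0] = 1
    elif c == 'cat':
        new[i][1] = 1
    # and so on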
I feel there's a more efficient way of doing this within numpy or pandas. I originally have the data as a DataFrame and then convert it to a numpy array, so I wouldn't mind a pandas solution.
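To make the target concrete, here is a minimal sketch of the kind of vectorized result I'm after. It assumes `pandas.get_dummies` and NumPy broadcasting are acceptable tools for this. The names `labels`, `classes`, `matrix`, and `matrix_np` are just illustrative, and the column order is assumed to follow first appearance, as in the table above.

```python
import numpy as np
import pandas as pd

labels = ['dog', 'cat', 'fish', 'dog', 'dog', 'bird', 'cat', 'bird']

# Distinct classes in order of first appearance: dog, cat, fish, bird.
classes = list(dict.fromkeys(labels))

# pandas: get_dummies builds one indicator column per distinct label.
# Its columns come out sorted, so reindex them to the order above.
dummies = pd.get_dummies(pd.Series(labels), dtype=int)
matrix = dummies[classes].to_numpy()

# NumPy only: compare every label against every class via broadcasting.
# arr[:, None] has shape (8, 1), np.array(classes)[None, :] has shape (1, 4).
arr = np.array(labels)
matrix_np = (arr[:, None] == np.array(classes)[None, :]).astype(int)
```

Both `matrix` and `matrix_np` should have shape (8, 4), one row per input item. Dropping `dtype=int` and `.astype(int)` would leave true booleans instead of 0/1 integers.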