When I use label_binarize I do not get the correct number of classes even though I specify it. This is my simple code:

import numpy as np
from sklearn.preprocessing import label_binarize

y = ['tap', 'not_tap', 'tap', 'tap', 'not_tap', 'tap', 'not_tap','not_tap']

y = label_binarize(y, classes=[0, 1])
n_classes = y.shape[1]

I get n_classes = 1. Running this code also produces the following warning:

FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  mask |= (ar1 == a)

Can you tell me how to correctly get n_classes = 2 as in this example?

Thank you!

Joe

1 Answer

label_binarize binarizes the values in a one-vs-all fashion.

Consider this example:

from sklearn.preprocessing import label_binarize
print(label_binarize([1, 6], classes=[1, 2, 4, 6]))

[[1 0 0 0]
 [0 0 0 1]]

Each column corresponds to one of the classes [1, 2, 4, 6], and a 1 marks the column whose class matches the value in that row.

The way you're invoking it now (label_binarize(y, classes=[0, 1])), none of the values (tap, not_tap) match any of the classes (0, 1), so every value is 0. Also note that with only two classes the output is a single column, which is why y.shape[1] is 1.
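
If you want to keep using label_binarize, a minimal sketch is to pass the labels that actually occur in y as classes. The variable name binarized is only illustrative. Note that the result still has one column, since there are only two classes:

```python
from sklearn.preprocessing import label_binarize

y = ['tap', 'not_tap', 'tap', 'tap', 'not_tap', 'tap', 'not_tap', 'not_tap']

# Pass the labels that actually appear in y; classes=[0, 1] matches none of them.
binarized = label_binarize(y, classes=['not_tap', 'tap'])

print(binarized.ravel())  # 1 where the label is 'tap', 0 where it is 'not_tap'
print(binarized.shape)    # two classes still produce a single column
```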

What you're looking for is a LabelBinarizer, which learns the classes from the data:

from sklearn.preprocessing import LabelBinarizer

y = ['tap', 'not_tap', 'tap', 'tap', 'not_tap', 'tap', 'not_tap','not_tap']
lb = LabelBinarizer()

# Learn the classes from y and binarize in one step
label = lb.fit_transform(y)
print(label)

[[1]
 [0]
 [1]
 [1]
 [0]
 [1]
 [0]
 [0]]

# Count the classes on the fitted binarizer rather than via label.shape[1]
n_classes = len(lb.classes_)
print(n_classes)  # 2
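
If what you actually need is an array with one column per class (so that shape[1] is 2), one possible sketch is to stack the complement of the single column next to it. This assumes a two-column indicator matrix is the goal. The name one_hot is illustrative, and the column order follows lb.classes_:

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer

y = ['tap', 'not_tap', 'tap', 'tap', 'not_tap', 'tap', 'not_tap', 'not_tap']

lb = LabelBinarizer()
label = lb.fit_transform(y)  # shape (8, 1): 1 for 'tap', 0 for 'not_tap'

# Assumption: a two-column indicator matrix is wanted.
# First column marks lb.classes_[0] ('not_tap'), second marks lb.classes_[1] ('tap').
one_hot = np.hstack([1 - label, label])

n_classes = one_hot.shape[1]
print(n_classes)
```

This only applies to the two-class case; with three or more classes LabelBinarizer already returns one column per class.
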
Sruthi