
I wrote a very simple scikit-learn decision tree to implement XOR:

from sklearn import tree
X = [[0, 0], [1, 1], [0, 1], [1, 0]]
Y = [0, 0, 1, 1]
clf = tree.DecisionTreeClassifier()
clf = clf.fit(X, Y)

print(clf.predict([0,1]))
print(clf.predict([0,0]))
print(clf.predict([1,1]))
print(clf.predict([1,0]))

Each predict call produces a warning like this:

DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.

What needs to change and why?

halfer
kee
    It seems the data you pass to predict should be in the same format as the data you pass to fit. Use [[0,1]] instead of [0,1]. – MrE Aug 10 '17 at 00:35

3 Answers


The input to clf.predict should be a 2D array of shape (n_samples, n_features), even when you are predicting a single sample. Thus, instead of writing

print(clf.predict([0,1]))

you need to write

print(clf.predict([[0,1]]))
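
If the sample already lives in a 1D NumPy array, the reshape mentioned in the warning does the same job as the extra brackets. The following is only a minimal sketch built on the question's data; the variable names sample and sample_2d are made up for illustration.

```python
import numpy as np
from sklearn import tree

# Training data from the question (XOR).
X = [[0, 0], [1, 1], [0, 1], [1, 0]]
Y = [0, 0, 1, 1]
clf = tree.DecisionTreeClassifier().fit(X, Y)

sample = np.array([0, 1])          # 1D array, shape (2,)
sample_2d = sample.reshape(1, -1)  # 2D array, shape (1, 2): one sample, two features

prediction = clf.predict(sample_2d)  # returns one label per row of the input
print(sample_2d.shape, prediction)
```

reshape(1, -1) means "one row, and work out the number of columns from the data", which matches the single-sample case described in the warning.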

Miriam Farber

The method operates on matrices (2D arrays) rather than vectors (1D arrays). As a convenience, older versions accepted a vector and treated it as a 1xN matrix. This led to usage errors, because some users forgot which way a vector was oriented (1xN vs Nx1).

The suggestion tells you how to reshape your vector to the proper matrix shape. For constant vectors, just write them as matrices:

clf.predict( [ [0, 1] ] )

The "other direction" (wrong for this application) would be

clf.predict( [ [0], [1] ] )
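
To make the two orientations concrete, here is a small sketch that uses only NumPy to show the shapes involved. The names row and column are illustrative, not part of scikit-learn.

```python
import numpy as np

values = np.array([0, 1])

# One sample with two features: the shape this classifier expects.
row = values.reshape(1, -1)

# Two samples with one feature each: the "other direction".
column = values.reshape(-1, 1)

print(row.shape)     # (1, 2)
print(column.shape)  # (2, 1)
```

A classifier fitted on two features, like the one in the question, should reject the (2, 1) input because the number of features does not match what it saw during fit.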
Prune

As the warning message points out, you are passing a single sample to predict. You can either reshape it with reshape(1, -1) or wrap each sample in an outer list, as follows:

from sklearn import tree
X = [[0, 0], [1, 1], [0, 1], [1, 0]]
Y = [0, 0, 1, 1]
clf = tree.DecisionTreeClassifier()
clf = clf.fit(X, Y)

print(clf.predict([[0,1]]))
print(clf.predict([[0,0]]))
print(clf.predict([[1,1]]))
print(clf.predict([[1,0]]))
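
Since predict takes a 2D array anyway, the four calls can also be collapsed into one by passing all the samples together. This is just a sketch of that idea; samples and predictions are names chosen here for illustration.

```python
from sklearn import tree

# Training data from the question (XOR).
X = [[0, 0], [1, 1], [0, 1], [1, 0]]
Y = [0, 0, 1, 1]
clf = tree.DecisionTreeClassifier().fit(X, Y)

# One row per sample, so this is already a 2D input of shape (4, 2).
samples = [[0, 1], [0, 0], [1, 1], [1, 0]]
predictions = clf.predict(samples)

# predict returns one label per input row, in the same order.
for sample, label in zip(samples, predictions):
    print(sample, "->", label)
```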
White