I trained an SVC, but I am confused about how to interpret the probability results. Let's say there are 3 categories (cat, dog, and fish) and I want to know the probability of each one for a given sample. I used .predict to find the prediction and .predict_proba to get the probabilities, but the probabilities do not look right for classes with few samples.
from sklearn import svm

X = [[1, 2, 3], [2, 3, 4], [1, 1, 3], [2, 3, 5], [3, 4, 6], [2, 3, 4], [1, 2, 3]]
y = ['cat', 'dog', 'cat', 'dog', 'fish', 'dog', 'cat']
clf = svm.SVC(probability=True)
clf.fit(X, y)

# scikit-learn expects a 2D array of samples, so the single point is wrapped in a list
a = clf.decision_function([[3, 4, 6]])
b = clf.predict_proba([[3, 4, 6]])
c = clf.score(X, y)  # accuracy on the training data itself
print(clf.classes_)
print('accuracy', c)
print('Fish prediction')
print(clf.predict([[3, 4, 6]]))
print('decision function', a)
print('predict', b)
If I predict a class with few samples, like fish, the prediction is correct, but can someone explain why the predicted probability is so low: 0.027? (I know it uses Platt scaling, but why was dog not selected, given its probability of 0.71?) Is there a way to obtain the probability with which the SVM predicts that the result is fish?
['cat' 'dog' 'fish']
accuracy 1.0
Fish prediction
['fish']
decision function [[-0.25639624 -0.85413901 -0.25966687]]
predict [[ 0.26194797 0.71056399 0.02748803]]
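For context, Platt scaling fits a sigmoid on top of the SVM decision values, so the probabilities come from a second, separately fitted model rather than from the decision function itself. Below is a minimal sketch of the binary case only. The parameters A and B are made-up values for illustration, not the ones scikit-learn fitted here, and the multi-class case additionally combines the pairwise estimates.

```python
import math


def platt_probability(decision_value, A, B):
    """Map an SVM decision value to a probability with Platt's sigmoid."""
    return 1.0 / (1.0 + math.exp(A * decision_value + B))


# Hypothetical sigmoid parameters; in practice they are fitted on
# cross-validated decision values from the training data.
A, B = -2.0, 0.0

for f in (-1.0, 0.0, 1.0):
    print(f, round(platt_probability(f, A, B), 3))
```

With so few training samples, the fitted sigmoid may be a poor match for the decision values, which is one plausible reason the two outputs can disagree.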
Let's say I want to predict cat:
# predict cat (the single sample is wrapped in a list to form a 2D input)
d = clf.decision_function([[1, 2, 3]])
e = clf.predict_proba([[1, 2, 3]])
print('Cat prediction')
print(clf.predict([[1, 2, 3]]))
print('decision function', d)
print('predict', e)
This printed the expected probability of 0.61 for cat:
Cat prediction
['cat']
decision function [[ 0.99964652 0.99999999 0.54610562]]
predict [[ 0.61104793 0.19764548 0.19130659]]
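To compare the two outputs side by side, here is a rough sketch that checks whether the class with the highest predict_proba value matches the label from predict for both points. It reuses the data above. The random_state argument is my addition to make the internal cross-validation repeatable, and the exact numbers will likely differ between scikit-learn versions.

```python
import numpy as np
from sklearn import svm

X = [[1, 2, 3], [2, 3, 4], [1, 1, 3], [2, 3, 5], [3, 4, 6], [2, 3, 4], [1, 2, 3]]
y = ['cat', 'dog', 'cat', 'dog', 'fish', 'dog', 'cat']

# random_state is an added assumption, used only for repeatable probabilities
clf = svm.SVC(probability=True, random_state=0)
clf.fit(X, y)

samples = [[3, 4, 6], [1, 2, 3]]
hard_labels = clf.predict(samples)          # uses the SVM decision values directly
probabilities = clf.predict_proba(samples)  # uses the separately fitted Platt model
proba_labels = clf.classes_[np.argmax(probabilities, axis=1)]

for sample, hard, soft, row in zip(samples, hard_labels, proba_labels, probabilities):
    print(sample, 'predict:', hard, 'argmax(predict_proba):', soft, row)
```

Whenever the two labels differ for a sample, the probabilities and the prediction are inconsistent for that point, which is the behavior I am seeing for fish.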
Also, I think I am using score incorrectly, since the model is tested against the same data it was trained on and yields a value of 1, meaning 100% accuracy. How do I use score correctly?
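For reference, this is a sketch of what I understand scoring on held-out data to look like. It uses the iris dataset as a stand-in (an assumption on my part), because my seven samples are too few to split meaningfully. The split ratio, random_state, and number of folds are arbitrary choices.

```python
from sklearn import svm
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split

# Stand-in data (assumption): a larger labelled set than the seven samples above
X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples so the score is measured on data the model never saw
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

clf = svm.SVC()
clf.fit(X_train, y_train)

train_accuracy = clf.score(X_train, y_train)  # optimistic: same data used for fitting
test_accuracy = clf.score(X_test, y_test)     # estimate on unseen data

# Alternative: 5-fold cross-validation, giving one accuracy value per fold
cv_scores = cross_val_score(svm.SVC(), X, y, cv=5)

print('train accuracy', train_accuracy)
print('test accuracy', test_accuracy)
print('cross-validation accuracy', cv_scores.mean())
```

Is comparing the train and test accuracy like this the right way to use score?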