So far I have consulted another post and the sklearn documentation.
In general, I want to produce the following example:
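import numpy as np
# model is assumed to be any scikit-learn classifier instance
X = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
y = np.array(['A', 'B', 'B', 'C'])  # one label per row of X
Xt = np.array([[11, 22], [22, 33], [33, 44], [44, 55]])
model = model.fit(X, y)
pred = model.predict(Xt)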
However, for output I would like to see 3 columns per observation from pred:
A | B | C
.5 | .2 | .3
.25 | .25 | .5
...
That is, I want a separate probability for each class to show up in my prediction.
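To make the desired shape concrete, here is a minimal sketch of one way to get one probability column per class, assuming a classifier that exposes predict_proba. RandomForestClassifier, n_estimators=50 and random_state=0 are illustrative choices only, and the labels are trimmed to one per row of X; none of this is settled by the text above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy data from the example, with one label per row of X (assumption).
X = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
y = np.array(['A', 'B', 'B', 'C'])
Xt = np.array([[11, 22], [22, 33], [33, 44], [44, 55]])

# Illustrative model choice; any classifier with predict_proba would do.
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# predict_proba returns an array of shape (n_samples, n_classes);
# the columns follow the order of model.classes_.
proba = model.predict_proba(Xt)

print(" | ".join(model.classes_))
for row in proba:
    print(" | ".join(f"{p:.2f}" for p in row))
```

Each row of proba sums to 1, because this is ordinary multiclass classification: every observation belongs to exactly one class.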
I believe that the best approach would be Multilabel classification from the second link I provided above. Additionally, I think it might be a good idea to try one of the multi-label or multi-output models listed below:
Support multilabel:
sklearn.tree.DecisionTreeClassifier
sklearn.tree.ExtraTreeClassifier
sklearn.ensemble.ExtraTreesClassifier
sklearn.neighbors.KNeighborsClassifier
sklearn.neural_network.MLPClassifier
sklearn.neighbors.RadiusNeighborsClassifier
sklearn.ensemble.RandomForestClassifier
sklearn.linear_model.RidgeClassifierCV
Support multiclass-multioutput:
sklearn.tree.DecisionTreeClassifier
sklearn.tree.ExtraTreeClassifier
sklearn.ensemble.ExtraTreesClassifier
sklearn.neighbors.KNeighborsClassifier
sklearn.neighbors.RadiusNeighborsClassifier
sklearn.ensemble.RandomForestClassifier
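For comparison, here is a rough sketch of what the multilabel route could look like with one of the listed models. The label sets below are made up for illustration, and MultiLabelBinarizer and RandomForestClassifier are assumed choices, not something the documentation prescribes for this problem.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

X = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
Xt = np.array([[11, 22], [22, 33], [33, 44], [44, 55]])

# Hypothetical label sets: each observation may carry several labels.
labels = [['A'], ['A', 'B'], ['B', 'C'], ['C']]

# Turn the label sets into a binary indicator matrix, one column per label.
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, Y)

# For multi-output targets, predict_proba returns a list with one
# (n_samples, 2) array per label; column 1 is P(label is present).
proba_list = model.predict_proba(Xt)
per_label = np.column_stack([p[:, 1] for p in proba_list])

print(" | ".join(mlb.classes_))
for row in per_label:
    print(" | ".join(f"{p:.2f}" for p in row))
```

Note that in the multilabel setting the rows need not sum to 1, since each label is scored independently. If the table I want is one where each row sums to 1, plain multiclass predict_proba may be the closer fit.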
However, I am looking for someone who has more confidence and experience in doing this the right way. All feedback is appreciated.
-bmc