Say I have some training data for scikit-learn:

features, labels = assign_dataSets() #assignment operation

Here features is a 2-D array, whereas labels is a 1-D array containing only the values 0 and 1.

I split the points by class label:

f1x = [features[i][0] for i in range(0, len(features)) if labels[i]==0]
f2x = [features[i][0] for i in range(0, len(features)) if labels[i]==1]
f1y = [features[i][1] for i in range(0, len(features)) if labels[i]==0]
f2y = [features[i][1] for i in range(0, len(features)) if labels[i]==1]

Now I plot the said data:

import matplotlib.pyplot as plt
plt.scatter(f1x,f1y,color='b')
plt.scatter(f2x,f2y,color='y')
plt.show()

Now I want to fit a classifier, for example SVC:

from sklearn.svm import SVC
clf = SVC()
clf.fit(features, labels)

My question: since support vector machines are really slow to fit, is there a way to monitor the classifier's decision boundary in real time, i.e. while the fit is still running? I know that I can plot the decision boundary after fitting has finished, but I want the plot to update as training progresses, perhaps by using threading and running predictions on an array of points generated with linspace. Does the fit function even allow this, or do I need to use some other library?

Just so you know, I am new to machine-learning.

Souyama
    I don't think there is support for callback-like functionality needed to grab unfinished solutions. Keep in mind, that SVC is based on libsvm, which is wrapped. – sascha Jan 29 '18 at 16:14

1 Answer

scikit-learn can report training progress, but as far as I know only for a few estimators (e.g. GradientBoostingClassifier, MLPClassifier). To turn it on, set verbose=True. Note that this prints progress information to the console while fitting; it does not draw the decision boundary. For example:

from sklearn.ensemble import GradientBoostingClassifier
clf = GradientBoostingClassifier(verbose=True)
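
To make this more concrete, here is a minimal sketch of how the boundary could be inspected stage by stage with GradientBoostingClassifier. The toy dataset from make_moons, the grid size, and the variable names are my own assumptions, not part of the question. Note that staged_predict replays the boosting stages after fit has finished, so this is closer to "playback" than to true real-time monitoring.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.ensemble import GradientBoostingClassifier

# Toy stand-in for the question's features/labels (assumption).
features, labels = make_moons(n_samples=200, noise=0.2, random_state=0)

# verbose=1 prints a progress table to the console while fitting.
clf = GradientBoostingClassifier(n_estimators=20, verbose=1, random_state=0)
clf.fit(features, labels)

# Grid of points covering the data, built with np.linspace.
xs = np.linspace(features[:, 0].min() - 0.5, features[:, 0].max() + 0.5, 50)
ys = np.linspace(features[:, 1].min() - 0.5, features[:, 1].max() + 0.5, 50)
xx, yy = np.meshgrid(xs, ys)
grid = np.c_[xx.ravel(), yy.ravel()]

# staged_predict yields the predicted class for every grid point after
# each boosting stage; reshaping gives one boundary map per stage.
stage_boundaries = [pred.reshape(xx.shape) for pred in clf.staged_predict(grid)]
```

Each entry of stage_boundaries can be passed to plt.contourf(xx, yy, ...) together with the scatter plot from the question to see how the boundary changes from one stage to the next.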

I tried it with SVC and didn't work as expected (probably for the reason sascha mentioned in the comment section). Here is a different variation of your question on StackOverflow.
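
If a linear boundary is acceptable, one possible workaround inside scikit-learn is to swap SVC for an estimator that can be trained incrementally. The sketch below uses SGDClassifier with loss="hinge", which is a linear SVM trained by stochastic gradient descent and not the libsvm-based SVC from the question. The make_blobs data, the number of passes, and the grid are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import SGDClassifier

# Toy stand-in for the question's features/labels (assumption).
features, labels = make_blobs(n_samples=200, centers=2, random_state=0)

# Grid of points covering the data, built with np.linspace.
xs = np.linspace(features[:, 0].min() - 1, features[:, 0].max() + 1, 50)
ys = np.linspace(features[:, 1].min() - 1, features[:, 1].max() + 1, 50)
xx, yy = np.meshgrid(xs, ys)
grid = np.c_[xx.ravel(), yy.ravel()]

# Hinge loss makes this a linear SVM; partial_fit performs one pass over
# the data per call, so the model can be inspected between passes.
clf = SGDClassifier(loss="hinge", random_state=0)
snapshots = []
for epoch in range(10):
    clf.partial_fit(features, labels, classes=np.array([0, 1]))
    # Predicted class of every grid point after this pass.
    snapshots.append(clf.predict(grid).reshape(xx.shape))
    # A live plot could be drawn here, e.g. with
    # plt.contourf(xx, yy, snapshots[-1], alpha=0.3) and plt.pause(0.01).
```

Because the loop is in your own code, no threading is needed: the plot can be refreshed after every call to partial_fit.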

With regard to your second question, if you switch to Tensorflow (another machine learning library), you can use its Tensorboard feature to monitor a few metrics (e.g. error decay) in real time.

However, to the best of my knowledge, the SVM implementation is still experimental in v1.5. Tensorflow is really good when working with neural-network-based models.

If you decide to use a DNN for classification with Tensorflow, here is a discussion about the implementation on StackOverflow: "No easy way to add Tensorboard output to pre-defined estimator functions DnnClassifier?"

Useful References:

Tensorflow SVM (only linear support for now - v1.5): https://www.tensorflow.org/api_docs/python/tf/contrib/learn/SVM

Tensorflow Kernel Methods: https://www.tensorflow.org/versions/master/tutorials/kernel_methods

Tensorflow Tensorboard: https://www.tensorflow.org/programmers_guide/summaries_and_tensorboard

Tensorflow DNNClassifier Estimator: https://www.tensorflow.org/api_docs/python/tf/estimator/DNNClassifier

sinapan
    Yes. First, make sure your Keras setup is running on Tensorflow. Then you can use `keras.callbacks.TensorBoard` to do the same thing. There is already a StackOverflow question on the implementation. Here is the [link](https://stackoverflow.com/questions/42112260/how-do-i-use-the-tensorboard-callback-of-keras) – sinapan Jan 31 '18 at 18:49
    Thank you. I fixed the typos. I think for NNs the speed heavily depends on the number of samples, number of training iterations and the total number of neurons. – sinapan Feb 01 '18 at 20:51