
I'm building a classifier that I want to host as a C# Windows service, exposing an endpoint I can call remotely with the text I want to classify. I currently have one working with IronPython and the Natural Language Toolkit (NLTK), using C# 4.0 dynamics. The code looks like this:

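using IronPython.Hosting;

// Create the IronPython engine and import the script as a dynamic module
var py = Python.CreateEngine();
dynamic script = py.ImportModule("MyPythonScript");
dynamic classifier = script.GetClassifier();
// build features etc., then train
dynamic trainedClassifier = classifier.TrainClassifier(featureSet);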

The classifier trains itself on startup (as above), and I call it in real time with the text I want to classify.

My issue is that I want to use the classifiers and vectorizers in scikit-learn.

IronPython doesn't support scikit-learn, as per this link (Can scikit be used from IronPython?).

Can anyone suggest the best methodology for this? I'm open to suggestions, but I need to hold the trained classifier in memory, as training it on every call will be prohibitive.

My research has yielded the following.

  1. IronPython 2.7 can support numpy and scipy (https://www.enthought.com/repo/.iron/). However, when I try to run this I get an error saying NumpyDotNet.dll cannot be found. I gave up, as scikit-learn probably won't work with IronPython anyway.

  2. I've looked at 'Python for .NET' (http://pythonnet.github.io/), but haven't been able to call it from C#. I reference Python.Runtime.dll but hit the same reference issue as this guy (https://stackoverflow.com/questions/22844519/missing-py-gil-from-c-pythonnet-example).

  3. Has anyone used Sharpkit.Learn (https://github.com/foreverzet/Sharpkit.Learn)? I specifically need Linear SVM and TfidfVectorizer.

  4. I'm open to other solutions for running a python script. However, I'll need to cache the trained classifier, and can't repeatedly train it.

I'm open to all ideas, and any help is appreciated. Thank you.

user3661633
  • If you can't make things work in IronPython (as you seem to have established), my solution would be to use "pure" CPython (e.g. the anaconda distribution), and interface to it using various IPC mechanisms available to you. The simplest one being a SimpleHTTPServer running locally, more complex options are available of course. – deets Jun 09 '14 at 16:34
  • Hi thanks for the reply. I'm happy to write a full blown cpython 2.7 app. I've not done one before - my python is limited to prototyping with scripts. I've seen how I can host python as a windows service (http://www.chrisumbel.com/article/windows_services_in_python), I can use pyodbc for data access, adding decent logging, OOP etc. Where I'm unsure is how to host a restful JSON webservice. Flask / CherryPy / web.py are being mentioned but unsure if this'll work on windows. This is going to be fun! – user3661633 Jun 10 '14 at 07:55
  • Don't be afraid :) I would suggest bottle (it's one file, the most minimal thing you can have), pyodbc doesn't seem to be necessary for me. Just start by putting your classification thingy into a package, add bottle and try hooking up calls to your classifier via a web-method. Just invoke it from within the browser (or even better, write unit-tests). Once you feel comfortable with this setup, connect C#, and then eventually convert the whole thing into a service. – deets Jun 10 '14 at 09:01
  • No fear! I've now got a python windows service running, and it's training a scikit learn classifier. I just need to work out how to host a json webservice to call it with text and I'm done. Thanks for your help. Silly question - how do I close this question off and thank you / increase your rep score? – user3661633 Jun 10 '14 at 11:15
  • sheesh, no idea, I'm new to this game. It seems to penalize discussions like this a bit. Gotta work around that somehow. Your intent is good enough for me though :) – deets Jun 10 '14 at 11:36
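
To illustrate the setup discussed in the comments above (train once at startup, keep the model in memory, expose it over a local HTTP endpoint), here is a minimal sketch using only the Python 3 standard library. Everything named here is an assumption made for illustration: the `/classify` path, the JSON shape, port 8000, and the `KeywordClassifier` class, which is a toy stand-in for a real trained model and not anything from scikit-learn.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class KeywordClassifier:
    """Hypothetical stand-in model: picks the label whose training
    texts share the most words with the input text."""

    def fit(self, texts, labels):
        self.vocab = {}
        for text, label in zip(texts, labels):
            self.vocab.setdefault(label, set()).update(text.lower().split())
        return self

    def predict(self, text):
        words = set(text.lower().split())
        return max(self.vocab, key=lambda label: len(words & self.vocab[label]))


# Trained once when the process starts, then reused for every request.
MODEL = KeywordClassifier().fit(
    ["great film loved it", "awful film hated it"],
    ["pos", "neg"],
)


class ClassifyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/classify":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            text = json.loads(self.rfile.read(length))["text"]
            if not isinstance(text, str):
                raise TypeError("text must be a string")
        except (ValueError, KeyError, TypeError):
            self.send_error(400, "expected a JSON object with a text field")
            return
        body = json.dumps({"label": MODEL.predict(text)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging in this sketch


def make_server(port=8000):
    # Bind to localhost only; port 0 asks the OS for any free port.
    return ThreadingHTTPServer(("127.0.0.1", port), ClassifyHandler)


if __name__ == "__main__":
    make_server().serve_forever()
```

Any client that can send an HTTP POST with a body like `{"text": "loved it"}` can call this, C# included. A real service would swap the stand-in for the actual trained classifier and probably use a small framework such as bottle, as suggested above.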

2 Answers

Marking this question as answered, as per the comments above. I couldn't host scikit-learn in IronPython and have instead written a service using CPython.

user3661633

As far as I know, calling scikit-learn/NumPy from C# through IronPython does not work.

The best approach would be to create a REST API web service for the Python scikit-learn/NumPy code using a framework like Flask, and to call that API from the C# code using the HttpClient class.

This would eliminate IronPython completely and would work regardless of what the Python code does.
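
As a rough sketch of what the Flask side of that could look like, assuming Flask and scikit-learn are installed under CPython 3: the route name `/classify`, the JSON shape, the port, and the tiny training set are all made up for illustration, and a real service would load its own training data instead.

```python
from flask import Flask, jsonify, request
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Placeholder training data (assumption); replace with the real corpus.
TRAIN_TEXTS = [
    "great film loved it",
    "wonderful acting and story",
    "awful film hated it",
    "terrible plot and boring",
]
TRAIN_LABELS = ["pos", "pos", "neg", "neg"]


def train_model():
    # TfidfVectorizer feeds a linear SVM, chained in one pipeline so
    # raw text can be passed straight to predict().
    model = make_pipeline(TfidfVectorizer(), LinearSVC())
    model.fit(TRAIN_TEXTS, TRAIN_LABELS)
    return model


# Trained once at startup and held in memory for every request.
MODEL = train_model()

app = Flask(__name__)


@app.route("/classify", methods=["POST"])
def classify():
    payload = request.get_json(silent=True)
    text = payload.get("text") if isinstance(payload, dict) else None
    if not isinstance(text, str):
        return jsonify(error="expected a JSON object with a 'text' string"), 400
    label = MODEL.predict([text])[0]
    return jsonify(label=str(label))


if __name__ == "__main__":
    app.run(host="127.0.0.1", port=5000)
```

From C#, the call would then be an HttpClient POST to `http://127.0.0.1:5000/classify` with a JSON body such as `{"text": "some text"}`. Flask's built-in development server is fine for trying this out, but it is not intended for production hosting.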