5

I have been working on designing REST APIs using Spring Framework and deploying them on web servers like Tomcat. I have also worked on building Machine Learning models and using them to make predictions with sklearn in Python. Now I have a use case wherein I want to expose one REST API that builds a Machine Learning model, and another REST API that makes predictions. What architecture would help me achieve this? (An example of this is Amazon Machine Learning: they have exposed REST APIs for generating models and making predictions.)

I searched around the internet and found the following ways:

  1. Write the whole thing in Java - ML model + REST API
  2. Write the whole thing in Python - ML model + REST API

But playing around with Machine Learning models and predictions is much easier and better supported in Python, with libraries like sklearn, than in Java. I would really like to use Python for the Machine Learning part.

I was thinking about an approach wherein I write the REST API using Java but use a sub-process to make the Python ML calls. Will that work?

Can someone help me with the probable architectural approaches that I can take? Also, please suggest the most feasible solution.

Thanks in advance.

harshlal028
  • The Skymind Intelligence Layer includes a machine learning model server with a REST API. https://docs.skymind.ai/v1.0.3/reference – racknuf Mar 18 '18 at 20:56
  • If you don't mind using Amazon Web Services, I would recommend Chalice. It's a framework for creating Lambda functions. It's very easy to learn and you won't have to worry about the infrastructure. If you have stored your ML model in a file, you can transfer it to a Bucket, so when the Lambda function is invoked, you can pull the model file and process the request. https://github.com/aws/chalice – sagar1025 Dec 30 '19 at 02:15
  • If you're looking for an easy way to build a custom machine learning API without even having to worry about the backend, you could check out https://www.nyckel.com – JerSchneid Mar 24 '21 at 17:46

6 Answers

4

As others mentioned,

  1. Using AzureML is an easy solution for deploying an ML model as a web service / REST service. However, you need to build the model on the Azure platform using a graphical interface (drag and drop, configure). People may not like this approach if they have used Python/sklearn code to build models. AzureML does have an option to include R and Python scripts, though I did not like it much.

  2. Another option is to store the Python ML model as a .pkl file and deploy it using Flask or the Django REST framework; client apps can then consume the REST service. Here is an excellent tutorial on YouTube: https://www.youtube.com/watch?v=s-i6nzXQF3g. A minimal Flask sketch is shown below.
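
For concreteness, here is a minimal, hedged sketch of that approach, assuming the model is already pickled to model.pkl; the /predict route and the JSON payload shape are my own choices, not from the tutorial:

```python
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

# Load the persisted model once at startup rather than on every request.
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    # Expect JSON like {"features": [5.1, 3.5, 1.4, 0.2]}
    payload = request.get_json(force=True)
    prediction = model.predict([payload["features"]])
    return jsonify({"prediction": prediction.tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

You could then test it with something like curl -X POST -H "Content-Type: application/json" -d '{"features": [5.1, 3.5, 1.4, 0.2]}' http://localhost:5000/predict.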

2

From what I've done in the past, I suggest two options (maybe there are more, but these are the ones I have implemented), plus a third I have not tried myself:

  1. If you have access and budget for cloud services, Azure ML is an excellent choice: a great ML framework and environment, and to create your REST API you just need about 2 clicks to expose it, then consume it using JSON from any language.
  2. Use scikit-learn and code your REST API in Python; it can still be consumed from any language. This option is not as easy and user-friendly as Azure ML because you have to code everything by hand and work with scikit's model-persistence functions (see the sketch after this list), but once exposed, you can use it from Java (or anything else). I used this as a reference: https://loads.pickle.me.uk/2016/04/04/deploying-a-scikit-learn-classifier-to-production/
  3. Spark MLlib: I haven't tried this option, but I asked a question here on Stack Overflow and got some interesting answers: How to serve a Spark MLlib model?
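
For option 2, the model-persistence step might look like the following sketch, assuming scikit-learn and joblib are installed (the file name and dataset are placeholders):

```python
from joblib import dump, load
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(n_estimators=100).fit(X, y)

# Persist the trained model to disk...
dump(clf, "model.joblib")

# ...and later, inside the REST process, load it back for predictions.
clf = load("model.joblib")
print(clf.predict(X[:1]))
```

joblib is generally preferred over plain pickle for scikit-learn models because it handles the large NumPy arrays inside them more efficiently.
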
Luis Leal
0

Well, it depends on the situation in which you use Python for ML. For classification models like random forest, you can use your training dataset to build the tree structures and export them as a nested dict. Whatever language you used to train, once you transform the model object into a plain data structure like that, you can use it anywhere.
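
A hedged sketch of that idea for a single sklearn decision tree follows (the helper name and dict layout are my own; a random forest would simply be a list of such dicts, one per tree, combined by voting):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=3).fit(X, y).tree_

def to_dict(node=0):
    # Leaf nodes are marked by children_left == -1 in sklearn's tree_.
    if tree.children_left[node] == -1:
        return {"class": int(tree.value[node].argmax())}
    return {
        "feature": int(tree.feature[node]),
        "threshold": float(tree.threshold[node]),
        "left": to_dict(tree.children_left[node]),
        "right": to_dict(tree.children_right[node]),
    }

# The resulting dict can be serialized to JSON and evaluated in any language.
print(to_dict())
```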

BUT if your situation involves large-scale, real-time, distributed datasets, then as far as I know the best way is probably to deploy the whole ML process on servers.

YJ.Lee
0

I'm using Node.js as my REST service and I just call out to the system to interact with the Python script that holds the stored model. You could always do that if you are more comfortable writing your services in Java: just make a call to Runtime.exec or use ProcessBuilder to call the Python script and get the reply back.
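
As a rough sketch of the Python side of that bridge (the script name and argument format are my own assumptions), the Java service could invoke something like python predict.py 5.1 3.5 1.4 0.2 via ProcessBuilder and read the prediction from stdout:

```python
# predict.py -- loads a pickled sklearn model, predicts on the
# feature values passed as command-line arguments, prints the result.
import pickle
import sys

with open("model.pkl", "rb") as f:
    model = pickle.load(f)

features = [float(x) for x in sys.argv[1:]]
print(model.predict([features])[0])
```

Note that this reloads the model on every call; for lower latency you may prefer one long-running Python process over spawning a sub-process per request.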

Travis Walsh
  • Hi. It is very desirable to remember the security issues when Runtime exec is used: https://stackoverflow.com/questions/11268189/security-concerns-with-runtime-exec – Duloren Oct 24 '17 at 19:45
0

By far the fastest way to get your sklearn model into an API is FlashAI.io; the service was made specifically for this purpose. I came across it when I was facing the same dilemma recently: I had trained a scikit-learn model on my local PC using Python, and I wanted to quickly expose it in an API that could be called via an HTTP POST request.

There are other options that were mentioned, all of which involve some learning curve and cost in time and effort simply to expose your model. FlashAI lets you expose your model within a couple of minutes: just save your .pkl file and upload it. Your model gets assigned a unique model ID and you just use that to make API requests without any limit. Done and done :)
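
For illustration only, calling such a hosted model over HTTP from Python might look like the sketch below; the endpoint URL and payload shape here are placeholders, not FlashAI's actual API (check the service's documentation for the real contract):

```python
import requests

# Placeholder endpoint and payload -- substitute the real model ID and
# URL from the hosting service's documentation.
response = requests.post(
    "https://example.com/models/<model-id>/predict",
    json={"features": [5.1, 3.5, 1.4, 0.2]},
)
print(response.json())
```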

erikW
0

I have been experimenting with this same task and would like to add another option, not using a REST API: the format of Apache Spark models is compatible between the Python and Java implementations of the framework. So, you could train and build your model in Python (using PySpark), export it, and import it on the Java side for serving/predictions. This works well.
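
A minimal sketch of the training/export side in PySpark (the path and toy data are placeholders) could look like this; the Java side would then load the same directory with, e.g., LogisticRegressionModel.load(path):

```python
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.linalg import Vectors
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("train").getOrCreate()

# Toy training data with the (label, features) schema Spark ML expects.
df = spark.createDataFrame(
    [(1.0, Vectors.dense([0.0, 1.1])), (0.0, Vectors.dense([2.0, 1.0]))],
    ["label", "features"],
)

model = LogisticRegression(maxIter=10).fit(df)

# The saved format (Parquet + metadata) is language-independent.
model.write().overwrite().save("/tmp/lr-model")

spark.stop()
```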

There are, however, some downsides to this approach:

  • Spark has two separate ML packages for different data formats (ML for DataFrames and MLlib for RDDs)
  • The algorithms for training models in each of these packages are not the same (no model parity)
  • The models and training classes don't have uniform interfaces. So, you have to be aware of what the expected format is and might have to transform your data accordingly for both training and inference.
  • Pre-processing for both training and inference has to be the same, so you either need to do this on the Python side for both stages or somehow replicate the pre-processing on the Java side.

So, if you want to avoid the downsides of a REST API solution (availability, network latency), then this might be the preferable option.

martin_wun