
Once a SparkML model has been trained on a Spark cluster, how can I take the trained model and make it available for scoring through a restful API?

The problem is that loading the model requires a SparkContext. Is there a way to 'fake' one, since a full cluster does not really seem necessary, or what is the minimum required to create a SparkContext?

Thomas

1 Answer


In some cases, yes, it can be done.

Many Spark models can be exported to PMML, a standardized format for ML models. You can then convert them with a Java library like jpmml-sparkml (https://github.com/jpmml/jpmml-sparkml) and score the resulting PMML file with any PMML evaluator, without a SparkContext.

How to export a model is covered in this question: Spark ml and PMML export.

You can also use Spark Streaming to calculate the values; however, it will have higher latency until Continuous Processing Mode becomes available.

For very time-consuming calculations, such as recommendation algorithms, it is quite common to pre-calculate the values and save them in a database like Cassandra, so the API only does a fast key lookup.

T. Gawęda