I am importing a model from h2o flow into h2o steam and deploying it as a prediction service. The problem is that the model has a date input feature, which was converted to a time-type field when the training data .csv was loaded in h2o flow.

[Image: CSV parsing setup in `h2o flow`]

These time values are converted to (I think) POSIX timestamps (in milliseconds) in the parsed .hex file in h2o flow.

[Image: parsed .hex file in `h2o flow`]

Thus, when I deploy models trained on this data in steam's prediction service, the input fields expect Double values (the timestamps) rather than a date string (e.g., "2016-12-21"), which is what human users of the service expect to be able to enter. This is the error that the steam prediction service gives me for the input date 2016-12-21.

[Image: error returned by the steam prediction service]

Is there any way around this? The service needs to be used by humans, and requiring users to enter POSIX millisecond timestamps instead of actual dates makes it unusable. For now, I am just using a model that does not include date inputs.

lampShadesDrifter

1 Answer

The prediction service uses the same format as the model was trained on. If the model used timestamps as input, the service will too. You need to add your own preprocessing to convert, e.g., 2016-12-21 to a timestamp before calling the prediction service.
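As a rough illustration of that conversion step, here is a minimal Python sketch. It assumes the dates were parsed as midnight UTC when the model was trained; if your h2o cluster used a different time zone, the offset would need to match. The function name `date_to_epoch_millis` is hypothetical and not part of h2o or steam.

```python
from datetime import datetime, timedelta, timezone

# Reference point for POSIX time: 1970-01-01 00:00:00 UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def date_to_epoch_millis(date_string, fmt="%Y-%m-%d"):
    """Convert a date string to milliseconds since the Unix epoch.

    Assumption: the date is interpreted as midnight UTC. The result is
    returned as a float because the prediction service expects a Double.
    """
    parsed = datetime.strptime(date_string, fmt).replace(tzinfo=timezone.utc)
    # Integer division by a 1 ms timedelta avoids floating-point rounding.
    millis = (parsed - EPOCH) // timedelta(milliseconds=1)
    return float(millis)


if __name__ == "__main__":
    # The date from the question, as the numeric value the model expects.
    print(date_to_epoch_millis("2016-12-21"))  # 1482278400000.0
```

You would run something like this on the user's input and pass the resulting number, not the original string, to the prediction service.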

Magnus
  • Could you point me towards any resources, tutorials, examples, or official docs on how to do this? This post (https://stackoverflow.com/q/43766401/8236733) led me here (https://github.com/h2oai/steam/blob/master/prediction-service-builder/examples/spam-detection-python/score.py), but I'm not fully sure what to make of it. Thanks for all your help. – lampShadesDrifter Sep 22 '17 at 22:15
  • StackOverflow has the answer: https://stackoverflow.com/questions/21711030/java-converting-a-date-to-a-different-format Use DateFormat.parse() – Magnus Sep 22 '17 at 22:27
  • Thank you for helping me. I know how to convert between POSIX timestamps and date strings (from working with MOJOs in local CLI Java programs). I was more lost about what a *preprocessing script* looks like when deploying a model from `steam`, since it's not clear to me how the .py gets used or called by steam (or the model?). – lampShadesDrifter Sep 22 '17 at 22:33
  • I may be confused about what you are suggesting. I am wondering if there is a way to convert the data entered by the user in the prediction service using a preprocessing .py script, after the "predict" button is pressed but before the model attempts to predict on the entered inputs. – lampShadesDrifter Sep 22 '17 at 22:42
  • OK, I see. Yes there are 2 possibilities for preprocessing, either using Java or Python. There's one example for each of them. Java: https://github.com/h2oai/steam/tree/master/prediction-service-builder/examples/spam-prejar and Python: https://github.com/h2oai/steam/tree/master/prediction-service-builder/examples/spam-detection-python Check out the readme for the Python example for a longer explanation of how it works. – Magnus Sep 22 '17 at 23:47