In Databricks, I used MLflow and got my model served through the REST API. It works fine when all model features are provided. But in my use case the consumer application provides only a single feature (the primary key), and my code has to look up the other features from a database based on that key and then call model.predict to return the prediction. From my research, I understood that the REST endpoint simply invokes the model.predict function. How can I make it invoke a data-massaging function before predicting?
1 Answer
There are two approaches for that:
- Use a custom MLflow model in which you override the `predict` function so that it queries the database (or another source) for the additional features and then calls the actual `predict` of the underlying model. You can find more information in the following answers: 1, 2.
- Use Databricks Feature Store for your data: train and log the model with the `FeatureStoreClient.log_model` function, publish the feature store tables to a database, and then use the model via model serving, which will automatically look up the features.
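
To make the first approach concrete, here is a minimal sketch of the lookup-then-predict logic. It is written in plain Python so it runs without MLflow installed; in a real deployment the wrapper would subclass `mlflow.pyfunc.PythonModel`, whose `predict(self, context, model_input)` signature it mirrors. The `DummyModel`, the `features` table, its columns `f1` and `f2`, and the in-memory SQLite database are all hypothetical stand-ins for your trained model and your feature database.

```python
import sqlite3


class DummyModel:
    """Hypothetical stand-in for the trained model: sums each feature row."""

    def predict(self, rows):
        return [sum(row) for row in rows]


class LookupThenPredict:
    """Wrapper whose predict() enriches primary keys with stored features.

    Assumption: with MLflow this class would subclass
    mlflow.pyfunc.PythonModel, which uses the same predict() signature.
    """

    def __init__(self, model, conn):
        self.model = model
        self.conn = conn

    def _lookup(self, key):
        # Fetch the remaining features for one primary key.
        row = self.conn.execute(
            "SELECT f1, f2 FROM features WHERE id = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(f"no features stored for key {key!r}")
        return list(row)

    def predict(self, context, model_input):
        # model_input is an iterable of primary keys sent by the consumer.
        rows = [self._lookup(key) for key in model_input]
        return self.model.predict(rows)


# Hypothetical feature table, held in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (id INTEGER PRIMARY KEY, f1 REAL, f2 REAL)")
conn.executemany(
    "INSERT INTO features VALUES (?, ?, ?)",
    [(1, 0.5, 1.5), (2, 2.0, 3.0)],
)

wrapper = LookupThenPredict(DummyModel(), conn)
predictions = wrapper.predict(None, [1, 2])
```

When adapting this to MLflow, the database connection would likely need to be opened inside `load_context` rather than passed to the constructor, since a live connection generally cannot be serialized together with the model.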

Alex Ott
Thank you! This will get me started on exploring other means to the solution. I started working on FunctionTransformer to add a custom function to my sklearn pipeline to eventually use pipeline.predict. – Bhawik Raja Feb 15 '22 at 14:02