I am trying to use MLflow in R. According to https://www.mlflow.org/docs/latest/models.html#r-function-crate, the crate flavor needs to be used for the model. My model uses the Random Forest function implemented in the ranger package:

model <- ranger::ranger(formula    = model_formula, 
                        data       = trainset,
                        importance = "impurity", 
                        probability = TRUE,
                        num.trees  = 500, 
                        mtry       = 10)

The model itself works and I can do the prediction on a testset:

test_prediction <- predict(model, testset)

As a next step, I am trying to wrap the model in the crate flavor, following the approach shown in https://docs.databricks.com/_static/notebooks/mlflow/mlflow-quick-start-r.html.

predictor <- crate(function(x) predict(model,.x))

However, this results in an error when I apply "predictor" to the testset:

predictor(testset)
Error in predict(model, .x) : could not find function "predict"

Does anyone know how to solve this issue? Do I have to pass the prediction function to crate differently? Any help is highly appreciated ;-)

Fabian S.

1 Answer

In my experience, that Databricks quickstart guide is wrong.

According to the carrier package documentation, you need to use explicit namespaces when calling non-base functions inside crate, because a crated function runs in an isolated environment that does not see the packages attached in your session. Since predict is actually part of the stats package, you'd need to specify stats::predict. Also, since your crated function depends on the global object named model, you'd need to pass that object as a named argument to crate as well.

Your code would end up looking something like this (I can't test it on your exact use case, since I don't have your data, but this works for me on MLflow in Databricks):

model <- ranger::ranger(formula    = model_formula, 
                        data       = trainset,
                        importance = "impurity", 
                        probability = TRUE,
                        num.trees  = 500, 
                        mtry       = 10)

predictor <- crate(function(x) {
  stats::predict(model, x)
}, model = model)

predictor(testset)
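
If you want to check the pattern without your own data, here is a minimal sketch that should run on its own. It assumes the carrier package is installed and swaps in a plain lm fit on the built-in mtcars data as a stand-in for the ranger model. The names fit, newdata, unqualified, crated and direct are made up for the illustration.

```r
library(carrier)

# Stand-in model so the example runs without ranger or the original data.
fit <- stats::lm(mpg ~ wt, data = mtcars)
newdata <- mtcars[1:3, ]

# Unqualified call and no bundled model: expected to fail inside the crate,
# so the error message is captured instead of stopping the script.
unqualified <- tryCatch(
  crate(function(x) predict(fit, x))(newdata),
  error = function(e) conditionMessage(e)
)

# Qualified call, with the model bundled into the crate under the name `model`.
predictor <- crate(
  function(x) stats::predict(model, x),
  model = fit
)

crated <- predictor(newdata)            # prediction through the crate
direct <- stats::predict(fit, newdata)  # same prediction, called directly
```

The crated and direct predictions should match, which is a quick way to confirm that the crate carries everything it needs before you hand it to MLflow.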

danh