I am running an MLflow experiment, and as part of it I would like to log a few artifacts as Python pickles.
For example, I am trying out different categorical encoders and want to log the encoder objects as pickle files.
Is there a way to achieve this?
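There are two functions for this: mlflow.log_artifact, which logs a local file or directory as an artifact, and mlflow.log_artifacts, which logs the contents of a local directory.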
So, once the encoder has been pickled to a local file such as encoder.pickle, it would be as simple as:
import mlflow
with mlflow.start_run():
    mlflow.log_artifact("encoder.pickle")
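The call above assumes encoder.pickle already exists on disk. A minimal sketch of the whole step, pickling an object and then logging it, might look like the following. The helper names (write_pickle, read_pickle, log_pickled_artifact) are made up for illustration and are not part of MLflow; mlflow.log_artifact is the only MLflow call used. mlflow is imported inside the logging helper so that the two pickling helpers also work where MLflow is not installed.

```python
import os
import pickle
import tempfile


def write_pickle(obj, path):
    """Serialize obj to path with pickle and return the path."""
    with open(path, "wb") as fh:
        pickle.dump(obj, fh)
    return path


def read_pickle(path):
    """Load a pickled object; only unpickle files from a trusted source."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def log_pickled_artifact(obj, filename, artifact_path=None):
    """Pickle obj into a temporary directory and log the file to the active run.

    Hypothetical helper: it assumes MLflow is installed and a tracking
    location is configured.
    """
    import mlflow  # lazy import, see the note above

    with tempfile.TemporaryDirectory() as tmp_dir:
        local_path = write_pickle(obj, os.path.join(tmp_dir, filename))
        # The file is copied to the artifact store during this call,
        # so the temporary directory can be removed afterwards.
        mlflow.log_artifact(local_path, artifact_path=artifact_path)
```

Usage would then be something like log_pickled_artifact(encoder, "encoder.pickle") inside a with mlflow.start_run(): block, where encoder is any picklable object such as a fitted categorical encoder.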
To use that pickled file at prediction time, you will need a custom MLflow model, something like this:
import mlflow.pyfunc

class my_model(mlflow.pyfunc.PythonModel):
    def __init__(self, encoders):
        self.encoders = encoders

    def predict(self, context, model_input):
        _X = ...  # do encoding using self.encoders
        # self.ctx is not defined in this snippet; it stands for the
        # fitted model, which has to be attached to the instance elsewhere.
        return str(self.ctx.predict([_X])[0])
Thank you Alex for providing the relevant documentation.
Here is how I do it:
Saving the encoder
from sklearn.preprocessing import OneHotEncoder
import mlflow.pyfunc

encoder = OneHotEncoder()
encoder.fit(X_train)

class EncoderWrapper(mlflow.pyfunc.PythonModel):
    def __init__(self, encoder):
        self.encoder = encoder

    def predict(self, context, model_input):
        return self.encoder.transform(model_input)

# Wrap the encoder
encoder_wrapped = EncoderWrapper(encoder)

# Log and save the encoder
encoder_path = ...
mlflow.pyfunc.save_model(python_model=encoder_wrapped, path=encoder_path)
mlflow.pyfunc.log_model(python_model=encoder_wrapped, artifact_path=encoder_path)
Loading the encoder
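encoder_path = ...
encoder = mlflow.pyfunc.load_model(encoder_path)
# load_model returns a generic pyfunc model, which exposes predict() rather
# than transform(); predict() calls EncoderWrapper.predict, which in turn
# applies the encoder's transform.
X_test_encoded = encoder.predict(X_test)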
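If the encoder was instead logged as a plain pickle file with mlflow.log_artifact, it could be read back without a pyfunc wrapper, roughly as sketched below. This is only an illustration under a few assumptions: load_pickled_artifact and its download parameter are hypothetical names, the run ID and artifact path must match whatever was logged, and mlflow.artifacts.download_artifacts requires a reasonably recent MLflow release.

```python
import pickle


def load_pickled_artifact(run_id, artifact_path, download=None):
    """Download a pickled artifact from a run and unpickle it.

    `download` is a hypothetical hook that lets the downloader be swapped
    out (for example in tests). By default the MLflow function
    mlflow.artifacts.download_artifacts is used.
    """
    if download is None:
        # Imported lazily so the function can be exercised without MLflow
        # when a custom downloader is supplied.
        from mlflow.artifacts import download_artifacts as download

    # The downloader returns the local path of the fetched file.
    local_path = download(run_id=run_id, artifact_path=artifact_path)

    # Only unpickle artifacts from runs you trust: unpickling can
    # execute arbitrary code.
    with open(local_path, "rb") as fh:
        return pickle.load(fh)
```

A call such as load_pickled_artifact(run_id, "encoder.pickle") would then return the original encoder object, on which transform can be called directly.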