I have an Azure Function written in Python with a simple purpose: return a prediction for a new observation based on a model I have trained, tested, and stored as a blob. I created the model in a Jupyter notebook and uploaded it to Azure Blob Storage. I can read the model file, but when I try to unpickle it I get an error: Exception: UnpicklingError: invalid load key, '\xef'.
I'm new to ML and Azure Functions, so I'm not sure where to start. Loading the model locally works fine. I've also downloaded the file back from Azure Storage, and that copy works fine too.
The PKL file is generated from a notebook like this:
with open("diabetes-model.pkl", "wb") as f:
    pickle.dump(model, f)
In my Azure function I'm passing a func.InputStream to a method that looks like this:
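import pickle

def do_prediction(modelFileStream):
    # modelFileStream is the func.InputStream handed to the function
    mod = modelFileStream.read()
    modelFileStream.close()
    model = pickle.loads(mod)  # the UnpicklingError is raised here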
The file starts like this in the debugger (it's almost 400KB):
b'\xef\xbf\xbd\x03cxgboost.sklearn\nXGBClassifier\nq\x00)\xef\xbf\xbdq\x01}q\x02(X\t\x00\x00\x00max_depthq\x03K\x0cX\r\x00\x00\x00learning_rateq\x04G?\xef\xbf\xbdz\xef\xbf\xbdG\xef\xbf\xbd\x14{X\x0c\x00\x00\x00n_estimatorsq\x05M,\x01X\t\x00\x00\x00verbosityq\x06K\x01X\x06\x00\x00\x00silentq\x07NX\t\x00\x00\x00objectiveq\x08X\x0f\x00\x00\x00binary:logisticq\tX\x07\x00\x00\x00boosterq\nX\x06\x00\x00\x00gbtreeq\
The error is: Exception: UnpicklingError: invalid load key, '\xef'.
I'm guessing there is some kind of encoding issue here. I've seen guidance that the contents should be Base64-encoded before being written, but that seems inefficient to me.
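One thing that may support the encoding guess: the bytes \xef\xbf\xbd are the UTF-8 encoding of U+FFFD, the Unicode replacement character, while a binary pickle (protocol 2 or later) normally starts with the byte \x80. The sketch below is only an illustration of how a text decode and re-encode could produce the same leading bytes and the same error. The dictionary is a stand-in for the real model, and looks_text_mangled is a hypothetical helper, not part of any library. It does not show where in the upload or binding path such a conversion would actually happen.

```python
import pickle

# Stand-in for the real model; any picklable object works for this check.
payload = pickle.dumps({"max_depth": 12}, protocol=3)

# Simulate a lossy text round trip: decode the binary data as UTF-8,
# substituting U+FFFD for invalid bytes, then encode it back to bytes.
mangled = payload.decode("utf-8", errors="replace").encode("utf-8")


def looks_text_mangled(data: bytes) -> bool:
    """Heuristic (hypothetical helper): does the data start with UTF-8 U+FFFD?"""
    return data.startswith(b"\xef\xbf\xbd")


print(payload[:2])   # first bytes of the intact pickle
print(mangled[:4])   # first bytes after the text round trip
print(looks_text_mangled(payload), looks_text_mangled(mangled))

try:
    pickle.loads(mangled)
except pickle.UnpicklingError as exc:
    print("UnpicklingError:", exc)
```

If the bytes read inside the function start with \xef\xbf\xbd but the local file starts with \x80, the data has presumably been treated as text somewhere between the notebook and the pickle.loads call.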
Would love some guidance on what is going on or what to try next.