I would like to deploy an AzureML web service via a Jupyter notebook on AzureML Studio using a Python 2.7 kernel. This web service will use (in addition to a trained ML model) a previously defined Pandas dataframe.
Now, as I understand from here, any global variables referred to by the published function are serialized using Pickle so they can be used when the web service is consumed. Indeed, if I publish a web service that makes use of a previously defined integer, function, or numpy array, I can consume the web service without a problem:
from azureml import services
import pandas as pd
import numpy as np

# Define some function...
def square(n):
    return n * n

# ... and some variables we might want to use in the web service below
df = pd.DataFrame(np.random.randint(low=0, high=10, size=(5, 5)))
x = df.values

# Create the web service
@services.publish(workspace_id, authorization_token)
@services.types(number=int)
@services.returns(int)
def my_published_func(number):
    tmp = sum(sum(x == number))
    return square(tmp)
However, as soon as I try to use the dataframe df inside my_published_func (e.g. when replacing x == number with df.values == number), a call to the web service gives the following error, which seems to indicate a serialization-related issue:
{u'error': {u'message': u'Module execution encountered an error.', u'code': u'ModuleExecutionError', u'details': [{u'message': u'Error 0085: The following error occurred during script evaluation, please view the output log for more information:\r\n---------- Start of error message from Python interpreter ----------\r\nCaught exception while executing function: Traceback (most recent call last):\n File "\server\InvokePy.py", line 118, in executeScript\n mod = safe_module_import(script)\n File "\server\InvokePy.py", line 79, in safe_module_import\n return import_module(h)\n File "C:\pyhome\lib\importlib\__init__.py", line 37, in import_module\n import(name)\n File "\temp\1263948264.py", line 1124, in \n __user_function = _deserialize_func(base64.decodestring(\'long string of gibberish goes here'), globals())\n File "\temp\1263948264.py", line 1101, in _deserialize_func\n codeArgs, funcArgs, updatedGlobals = pickle.loads(data)\nImportError: No module named indexes.base\n\r\n\r\n---------- End of error message from Python interpreter ----------', u'code': u'85', u'target': u'Execute Python Script RRS'}]}}
Does this mean that global variables referring to Pandas dataframes (even though these can be pickled) cannot be used inside an AzureML web service, or is there a way to make this work?
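For what it's worth, here is the minimal check I have in mind when I say the dataframe itself can be pickled: a round-trip through the standard-library pickle module in a single local session succeeds without error (this does not exercise the cross-environment deserialization that AzureML performs, which may be where the problem lies).

```python
import pickle

import numpy as np
import pandas as pd

# The same kind of dataframe as in the web service example above
df = pd.DataFrame(np.random.randint(low=0, high=10, size=(5, 5)))

# Serialize and immediately deserialize within the same session
restored = pickle.loads(pickle.dumps(df))

# The round-tripped dataframe is identical to the original
assert restored.equals(df)
```

So pickling per se works locally; the failure only appears when the pickled bytes are loaded on the AzureML side.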