I'm trying to load a scikit-learn model from an Azure Data Lake. To do that, I first saved the model using joblib:
from sklearn import svm
from sklearn import datasets
from joblib import dump, load
clf = svm.SVC()
X, y = datasets.load_iris(return_X_y=True)
clf.fit(X, y)
dump(clf, 'model_test.joblib')
Then I uploaded the file to the data lake manually, created a shared access signature (SAS) and now I'm trying to read it back with the following code:
from azure.storage.blob import BlobClient
SAS_URL = "https://XXXX"
blob_client = BlobClient.from_blob_url(SAS_URL)
#download_stream = blob_client.download_blob()
downloader = blob_client.download_blob()
b = downloader.readall()
loaded_model = load(b)
But I'm getting this error:
UnicodeDecodeError Traceback (most recent call last)
<ipython-input-65-3cebf1b7dfbc> in <module>
7
8 b = downloader.readall()
----> 9 loaded_model = load(b)
~\Anaconda3\lib\site-packages\joblib\numpy_pickle.py in load(filename, mmap_mode)
575 obj = _unpickle(fobj)
576 else:
--> 577 with open(filename, 'rb') as f:
578 with _read_fileobject(f, filename, mmap_mode) as fobj:
579 if isinstance(fobj, str):
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
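From the traceback, it looks like `load` falls through to `open(filename, 'rb')`, so the bytes returned by `readall()` are being treated as a file path rather than as the pickled content. A possible workaround, as a minimal sketch, is to wrap the bytes in an `io.BytesIO` object, since `joblib.load` also accepts file objects. The sketch below does not touch Azure at all: the `model` dict is a hypothetical stand-in for the fitted classifier, and the in-memory `dump` only simulates what `downloader.readall()` would return.

```python
import io
from joblib import dump, load

# Hypothetical stand-in for the fitted classifier; any picklable object
# goes through the same dump/load path.
model = {"kernel": "rbf", "C": 1.0}

# Simulate the downloaded blob: dump into memory and take the raw bytes,
# which is the same kind of value that downloader.readall() returns.
buffer = io.BytesIO()
dump(model, buffer)
b = buffer.getvalue()

# Passing b directly makes load() treat it as a path.
# Wrapping it in BytesIO gives load() a file object to read from instead.
loaded_model = load(io.BytesIO(b))
print(loaded_model == model)
```

Applied to the code above, that would presumably mean replacing `load(b)` with `load(io.BytesIO(b))`, though I have not confirmed this against the data lake itself.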