I'm trying to load a scikit-learn model from an Azure Data Lake. To do that, I first saved the model using joblib:

from sklearn import svm
from sklearn import datasets
from joblib import dump, load

# Train a small SVM classifier on the iris dataset
clf = svm.SVC()
X, y = datasets.load_iris(return_X_y=True)
clf.fit(X, y)

# Serialize the fitted model to a local file
dump(clf, 'model_test.joblib')

Then I uploaded the file to the data lake manually, created a shared access signature (SAS) and now I'm trying to read it back with the following code:

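from azure.storage.blob import BlobClient

SAS_URL = "https://XXXX"
blob_client = BlobClient.from_blob_url(SAS_URL)
#download_stream = blob_client.download_blob()
downloader = blob_client.download_blob()
b = downloader.readall()  # raw bytes of the uploaded .joblib file
loaded_model = load(b)    # this is the line that raises the error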

But I'm getting this error:

UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-65-3cebf1b7dfbc> in <module>
      7 
      8 b = downloader.readall()
----> 9 loaded_model = load(b)

~\Anaconda3\lib\site-packages\joblib\numpy_pickle.py in load(filename, mmap_mode)
    575             obj = _unpickle(fobj)
    576     else:
--> 577         with open(filename, 'rb') as f:
    578             with _read_fileobject(f, filename, mmap_mode) as fobj:
    579                 if isinstance(fobj, str):

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
Luis Ramon Ramirez Rodriguez
  • You can refer to [Python pickle error: UnicodeDecodeError](https://stackoverflow.com/questions/32957708/python-pickle-error-unicodedecodeerror), [UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 3131: invalid start byte](https://stackoverflow.com/questions/38518023/unicodedecodeerror-utf8-codec-cant-decode-byte-0x80-in-position-3131-invali) and [UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte](https://stackoverflow.com/questions/52610862/unicodedecodeerror-utf-8-codec-cant-decode-byte-0x80-in-position-0-invalid) – Ecstasy Sep 21 '21 at 08:26
  • Have a look at this page: https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-directory-file-acl-python . Can you also check whether this is Data Lake Storage Gen2 or Gen1? – Bheeshma Sep 24 '21 at 06:54
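
A possible way around the error, offered as a sketch rather than a confirmed fix: the traceback shows joblib's `load` treating its argument as a filename and calling `open(filename, 'rb')`, so handing it the raw bytes from `readall()` fails. `load` also accepts a file-like object, so wrapping the bytes in `io.BytesIO` should let it read the pickle from memory. The helper name `load_model_from_bytes` is hypothetical, and the in-memory dump below is only a stand-in for the downloaded blob so the snippet runs without Azure access.

```python
import io

import joblib


def load_model_from_bytes(raw: bytes):
    """Deserialize a joblib-dumped object held in memory (hypothetical helper)."""
    # joblib.load reads directly from objects that expose .read();
    # anything else is treated as a path and passed to open().
    return joblib.load(io.BytesIO(raw))


# Stand-in for the downloaded blob: dump a small object into an
# in-memory buffer instead of fetching it from the data lake.
buffer = io.BytesIO()
joblib.dump({"kernel": "rbf", "C": 1.0}, buffer)
raw_bytes = buffer.getvalue()

# Round-trip the bytes back into a Python object.
restored = load_model_from_bytes(raw_bytes)
```

With the code from the question, this would amount to replacing `load(b)` with `load(io.BytesIO(b))`, assuming `b` holds the unmodified contents of `model_test.joblib`.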

0 Answers