I have trained a linear regression model and saved it to a .pkl file with the following code:

import pickle

# save the model ('mod' is the fitted regression model)
filename = 'linear_model.pkl'
pickle.dump(mod, open(filename, 'wb'))

# load the model
load_model = pickle.load(open(filename, 'rb'))
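
For reference, the same round trip runs end to end with a plain dict standing in for `mod` (the training code isn't shown here), and with context managers so the files are closed properly:

```python
import pickle

mod = {'coef': 1.5, 'intercept': 0.2}   # stand-in for the fitted model

filename = 'linear_model.pkl'

# save the model; the with-statement closes the file even on error
with open(filename, 'wb') as f:
    pickle.dump(mod, f)

# load the model back
with open(filename, 'rb') as f:
    load_model = pickle.load(f)

print(load_model)   # the restored object equals the original
```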
 

After that, I tried to use the model in Lambda by loading the .pkl file from S3. Despite a lot of research and many attempts, I could not figure out how to do it. My last attempt was this one:

from io import BytesIO
import pickle
import boto3
import base64
import json

s3_client = boto3.client('s3')


def lambda_handler(event, context):
    # getting bucket and object key from event object
    source_bucket = event['Records'][0]['s3']['bucket']['name']
    key = event['Records'][0]['s3']['object']['key']

    data = BytesIO()
    s3_client.download_fileobj(source_bucket, key, data)
    data.seek(0)    # move back to the beginning after writing
    print("Data", data.read())

    
    load_model = pickle.load(open(data, 'rb'))
    print("load_model", load_model)
    
    y_pred = load_model.predict([[140000]])
[ERROR] TypeError: expected str, bytes or os.PathLike object, not _io.BytesIO
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 25, in lambda_handler
    load_model = pickle.load(open(data, 'rb'))

I tried to fix this error by passing `data.read()` instead, but then `open()` fails because no file with that name exists in the directory.

1 Answer
  • `data` is a `BytesIO` object, but you try to `open()` it as if it were a file path; `open()` only accepts a path, which is why it raises the `TypeError`.
  • Don't read from the stream before unpickling: the `print("Data", data.read())` call consumes it, so a later read returns nothing.
  • `pickle.load` reads from a file object, but here you already have the bytes in a stream, so use the `loads` function on `data.read()` instead.
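
A minimal sketch of the difference (with a stand-in dict instead of a real model):

```python
import pickle
from io import BytesIO

obj = {'coef': 1.5}          # stand-in for the fitted model

buf = BytesIO()
pickle.dump(obj, buf)        # write the pickle bytes into the in-memory stream
buf.seek(0)                  # rewind before reading

# loads() takes raw bytes, so read the stream first
restored = pickle.loads(buf.read())

# load() also accepts any file-like object, so the BytesIO works directly too
buf.seek(0)
restored_2 = pickle.load(buf)

print(restored == restored_2 == obj)   # True
```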

Updated version of the lambda function:

from io import BytesIO
import pickle
import boto3

s3_client = boto3.client('s3')


def lambda_handler(event, context):
    # getting bucket and object key from event object
    source_bucket = event['Records'][0]['s3']['bucket']['name']
    key = event['Records'][0]['s3']['object']['key']

    data = BytesIO()
    s3_client.download_fileobj(source_bucket, key, data)
    data.seek(0)    # move back to the beginning after writing

    # loads() takes the raw pickle bytes, so read the whole stream
    load_model = pickle.loads(data.read())
    print("load_model", load_model)

    y_pred = load_model.predict([[140000]])
  • Hi @Mehmet Güngören I followed your answer, but ended up with another error. `[ERROR] TypeError: a bytes-like object is required, not '_io.BytesIO' Traceback (most recent call last): File "/var/task/lambda_function.py", line 24, in lambda_handler load_model = pickle.loads(data)`. – Diego A Jan 21 '23 at 15:09
  • Just do this, you will get bytes like object from BytesIO --> `load_model = pickle.loads(data.read())` – Mehmet Güngören Jan 21 '23 at 15:13
  • Thanks @Mehmet Güngören. It worked, only the timeout needs to be increased, which is already answered here: [Edit Timeout](https://stackoverflow.com/questions/62948910/aws-lambda-errormessage-task-timed-out-after-3-00-seconds). – Diego A Jan 21 '23 at 18:20