
I created an S3 bucket 'testshivaproject' and uploaded an image to it. When I try to access the image from a SageMaker notebook, it throws the error 'No such file or directory'.

# import libraries
import boto3, re, sys, math, json, os, sagemaker, urllib.request
from sagemaker import get_execution_role
import numpy as np                                   

# Define IAM role
role = get_execution_role()

my_region = boto3.session.Session().region_name # set the region of the instance

print("success :"+my_region)

Output: success :us-east-2

role

Output: 'arn:aws:iam::847047967498:role/service-role/AmazonSageMaker-ExecutionRole-20190825T121483'

bucket = 'testprojectshiva2' 
data_key = 'ext_image6.jpg' 
data_location = 's3://{}/{}'.format(bucket, data_key) 
print(data_location)

Output: s3://testprojectshiva2/ext_image6.jpg

test = load_img(data_location)

Output: No such file or directory

Similar questions have been raised (Load S3 Data into AWS SageMaker Notebook), but I did not find a solution there.

R-R

2 Answers


Thanks for using Amazon SageMaker!

I sort of guessed from your description, but are you trying to use the Keras load_img function to load images directly from your S3 bucket?

Unfortunately, the load_img function is designed to load files only from local disk, so passing an s3:// URL to it will always raise a FileNotFoundError.

It's common to first download images from S3 before using them, so you can use boto3 or the AWS CLI to download the file before calling load_img.
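A minimal sketch of that download-first approach, using boto3's download_file (the helper names parse_s3_uri and download_from_s3 are my own, not from the question):

```python
import os
from urllib.parse import urlparse


def parse_s3_uri(uri):
    # Split 's3://bucket/key' into (bucket, key).
    parsed = urlparse(uri)
    return parsed.netloc, parsed.path.lstrip('/')


def download_from_s3(uri, local_dir='.'):
    # Copy the S3 object to the notebook's local disk and return its path.
    import boto3  # available by default on SageMaker notebook instances
    bucket, key = parse_s3_uri(uri)
    local_path = os.path.join(local_dir, os.path.basename(key))
    boto3.client('s3').download_file(bucket, key, local_path)
    return local_path


# Usage (bucket/key taken from the question):
# local_path = download_from_s3('s3://testprojectshiva2/ext_image6.jpg')
# test = load_img(local_path)  # load_img now gets a plain filesystem path
```

Once the object is on local disk, load_img works exactly as it would with any other file.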

Alternatively, since the load_img function simply creates a PIL Image object, you can create the PIL object directly from the data in S3 using boto3, and not use the load_img function at all.

In other words, you could do something like this:

import boto3
from io import BytesIO
from PIL import Image

s3 = boto3.client('s3')
obj = s3.get_object(Bucket=bucket, Key=data_key)
test = Image.open(BytesIO(obj['Body'].read()))

Hope this helps you out in your project!

Kevin McCormick
  • Hi Kevin. I am having similar trouble when reading text data from a CSV file that I stored in an S3 bucket. I'm trying to use code similar to the xgboost_abalone example notebook to run predictions in a regression problem, and when using the open(FILE_DATA, 'r') line I get an error. I already checked the path and it is correct. Any ideas? – RafaJM Dec 30 '19 at 19:47

You may use the following code to pull in a CSV file into sagemaker.

import pandas as pd

bucket='your-s3-bucket'
data_key = 'your.csv'
data_location = 's3://{}/{}'.format(bucket, data_key)
df = pd.read_csv(data_location)

An alternative way to format the data_location variable:

data_location = f's3://{bucket}/{data_key}'
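Note that passing an s3:// URL to pd.read_csv relies on the s3fs package being installed. If it isn't, a fallback is to fetch the object with boto3 and parse the bytes yourself (read_csv_from_s3 is a hypothetical helper name, not a pandas API):

```python
from io import BytesIO

import pandas as pd


def read_csv_from_s3(bucket, key):
    # Fetch the object with boto3 and parse the bytes with pandas,
    # avoiding any dependency on s3fs.
    import boto3  # available by default on SageMaker notebook instances
    obj = boto3.client('s3').get_object(Bucket=bucket, Key=key)
    return pd.read_csv(BytesIO(obj['Body'].read()))


# The parsing step on its own, with in-memory sample data:
sample = BytesIO(b"a,b\n1,2\n")
df = pd.read_csv(sample)  # DataFrame with columns 'a' and 'b'
```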
david