1

I need to train a model on AWS SageMaker, but I'm unable to access the data in my S3 bucket from a SageMaker Jupyter notebook. My bucket is named "riceleaf". It contains four folders named s1, s2, s3, and s4, and each folder contains 330 images named 1.jpg and so on. The bucket was created in a US East region and is private.

One thing I tried was listing the objects: when I display each key, it shows 1.jpg and so on. But when I try to open that image, it doesn't work, so I think I'm not getting the exact data path.

My code needs the exact data path because it does some random data generation and has to access different folders. I therefore need the path up to the bucket so that I can vary the folder name and image name randomly in my code.

Please help me access the images from the SageMaker Jupyter notebook.

John Rotenstein
  • Does this answer your question? [Load S3 Data into AWS SageMaker Notebook](https://stackoverflow.com/questions/48264656/load-s3-data-into-aws-sagemaker-notebook) – rok Mar 21 '22 at 08:38

2 Answers

1

If you can list the keys but cannot open or download a file, make sure your notebook's execution role has the `s3:GetObject` permission on your `riceleaf` bucket. The default execution role only has permission to access buckets that have `sagemaker` in their name.
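
As a rough illustration of that permission, here is a minimal sketch of an inline policy document built in Python. The bucket name comes from the question; the idea of attaching it as an inline policy to the execution role is an assumption, and you may prefer to scope it differently.

```python
import json

BUCKET = "riceleaf"  # bucket name taken from the question

# Hypothetical inline policy for the notebook's execution role.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Listing keys is a bucket-level action.
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": f"arn:aws:s3:::{BUCKET}",
        },
        {
            # Reading objects is an object-level action, hence the "/*".
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
        },
    ],
}

# Print the JSON so it can be pasted into the IAM console.
print(json.dumps(policy, indent=2))
```

Note that the two statements use different resource ARNs: being able to list keys does not imply being able to read the objects.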

Once your permissions are set, you can use the S3 paginator from boto3 to list all the objects you need for training.
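
A minimal sketch of the paginator approach, assuming the bucket and folder names from the question. The helper name `list_image_keys` is made up for illustration. A single `list_objects_v2` call returns at most 1,000 keys, so with 4 × 330 images the paginator is what lets you see all of them.

```python
def list_image_keys(s3_client, bucket, prefix=""):
    """Return every object key under `prefix`, following pagination."""
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is missing from a page that holds no objects.
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys


# In the notebook (needs boto3 and a role with s3:ListBucket):
#   import boto3
#   client = boto3.client("s3")
#   keys = list_image_keys(client, "riceleaf", prefix="s1/")
#   uris = ["s3://riceleaf/" + k for k in keys]
```

Each key already includes the folder, for example `s1/1.jpg`, so the full path is the bucket name followed by the key.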

durga_sury
  • Actually the bucket has to start with `sagemaker-` – Simone Mar 03 '23 at 09:05
  • Not required, the `AmazonSageMakerFullAccess` policy allows GetObject access to any bucket with `sagemaker`, `Sagemaker`, or `SageMaker` in the name. Please check the IAM policy. – durga_sury Mar 03 '23 at 17:26
  • I created a new notebook and I had issues. I saw that the default policy attached had `"arn:aws:s3:::sagemaker-*"` in https://console.aws.amazon.com/iam/home?#/roles/AmazonSageMakerServiceCatalogProductsUseRole – Simone Mar 04 '23 at 11:55
  • Your role seems to be `AmazonSageMakerServiceCatalogProductsUseRole`, which is different from the `AmazonSageMakerFullAccess` policy that is generally set as the default policy for SageMaker roles: https://docs.aws.amazon.com/sagemaker/latest/dg/security-iam-awsmanpol.html#security-iam-awsmanpol-AmazonSageMakerFullAccess – durga_sury Mar 06 '23 at 00:36
0

What is the error message you are getting?

Using the boto3 SDK with S3, you can do something like this:

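import boto3
s3 = boto3.resource('s3')
bucket = s3.Bucket('riceleaf')  # bucket name from the question
for obj in bucket.objects.all():
    # print the full S3 URI of each object
    print('s3://{}/{}'.format(bucket.name, obj.key))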
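
To pick a random folder and image as the question describes, one option is to build the key yourself and fetch the bytes with `get_object`. This is only a sketch: the helper names are hypothetical, and it assumes the images in each folder are numbered 1.jpg through 330.jpg without gaps, which the question implies but does not state.

```python
import random


def random_image_key(folders=("s1", "s2", "s3", "s4"),
                     images_per_folder=330, rng=random):
    """Build a key such as 's3/17.jpg' (assumes names 1.jpg..330.jpg)."""
    folder = rng.choice(folders)
    number = rng.randint(1, images_per_folder)  # inclusive on both ends
    return f"{folder}/{number}.jpg"


def read_image_bytes(s3_client, bucket, key):
    """Download one object and return its raw bytes."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return response["Body"].read()


# In the notebook (needs boto3, Pillow, and s3:GetObject on the bucket):
#   import io, boto3
#   from PIL import Image
#   client = boto3.client("s3")
#   data = read_image_bytes(client, "riceleaf", random_image_key())
#   image = Image.open(io.BytesIO(data))
```

The key has no leading slash and no bucket name in it; the bucket is passed separately, which is a common reason an image key that lists fine still fails to open.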