
Following the answers to the question "Load S3 Data into AWS SageMaker Notebook", I tried to load data from an S3 bucket into a SageMaker Jupyter notebook.

I used this code:

import pandas as pd

bucket='my-bucket'
data_key = 'train.csv'
data_location = 's3://{}/{}'.format(bucket, data_key)

pd.read_csv(data_location)

I replaced 'my-bucket' with the ARN (Amazon Resource Name) of my S3 bucket (e.g. "arn:aws:s3:::name-of-bucket") and 'train.csv' with the name of the CSV file stored in the bucket. Otherwise I changed nothing. I got this ValueError:

ValueError: Failed to head path 'arn:aws:s3:::name-of-bucket/name_of_file_V1.csv': Parameter validation failed:
Invalid bucket name "arn:aws:s3:::name-of-bucket": Bucket name must match the regex "^[a-zA-Z0-9.\-_]{1,255}$" or be an ARN matching the regex "^arn:(aws).*:s3:[a-z\-0-9]+:[0-9]{12}:accesspoint[/:][a-zA-Z0-9\-]{1,63}$|^arn:(aws).*:s3-outposts:[a-z\-0-9]+:[0-9]{12}:outpost[/:][a-zA-Z0-9\-]{1,63}[/:]accesspoint[/:][a-zA-Z0-9\-]{1,63}$"

What did I do wrong? Do I have to modify the name of my S3 bucket?

Tobitor
  • I found it: I just had to replace `my-bucket` by `name-of-bucket` without the complete ARN, so without `arn:aws:s3:::`. :-D – Tobitor Feb 17 '21 at 10:39

1 Answer


The path should be:

data_location = 's3://{}/{}'.format(bucket, data_key)

where `bucket` is the plain bucket name (`<bucket-name>`), not the ARN. For example, `bucket='my-bucket-333222'`.
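Putting it together, a minimal sketch (the bucket name here is hypothetical; the actual `read_csv` call additionally requires the `s3fs` package and valid AWS credentials):

```python
import pandas as pd

# Use the plain bucket name, not the full ARN ("arn:aws:s3:::...").
bucket = 'name-of-bucket'   # hypothetical bucket name
data_key = 'train.csv'

# Builds 's3://name-of-bucket/train.csv'
data_location = 's3://{}/{}'.format(bucket, data_key)

# df = pd.read_csv(data_location)  # needs s3fs installed and AWS credentials
print(data_location)
```

pandas delegates `s3://` paths to `s3fs`/`boto3` under the hood, which is why the error in the question is a boto3 bucket-name validation failure: the regex accepts either a bare bucket name or an access-point/outpost ARN, and a plain bucket ARN (`arn:aws:s3:::...`) matches neither.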

Marcin
  • Does each sagemaker session has its own default bucket or all sessions share same default bucket? Also if later is true, then do all notebook instances have same default bucket? – Neo Jun 30 '23 at 21:22