I have created a connection to S3 using the boto3 library, but I'm still not able to access a file there. Can you please help me figure out what I'm doing wrong?

Bucket : itx-acm-pas-dev-incoming-sourcefiles
File Name : Maestro_Data_loop_assignment_content.csv
File Location : itx-acm-cde-prd-incoming-sourcefiles/janssen_learn/TgtFiles/Maestro_Data_loop_assignment_content/

Here is the code I have tried:

import boto3
import csv
import StringIO
import pandas as pd
from boto.s3.connection import S3Connection

s3 = boto3.client('s3',
         aws_access_key_id='yyyyyyyy',
         aws_secret_access_key='xxxxxxxxxxx')

AWS_KEY = 'yyyyyyyyyy'
AWS_SECRET = 'xxxxxxxxxx'
aws_connection = S3Connection(AWS_KEY, AWS_SECRET)
bucket = aws_connection.get_bucket('itx-acm-pas-dev-incoming-sourcefiles')

file_name = "itx-acm-cde-prd-incoming-sourcefiles/janssen_learn/TgtFiles/Maestro_Data_loop_assignment_content/Maestro_Data_loop_assignment_content.csv"

content = bucket.get_key(file_name).get_contents_as_string()
df = pd.read_csv(StringIO.StringIO(content))

print(df.head(3))
abhishek

1 Answer

Please refer to this related question: Read a csv file from aws s3 using boto and pandas

To find the file's key, you can iterate over the objects in the bucket like this:

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('test-bucket')
# Iterates through all the objects, doing the pagination for you. Each obj
# is an ObjectSummary, so it doesn't contain the body. You'll need to call
# get to get the whole body.
for obj in bucket.objects.all():
    key = obj.key
    body = obj.get()['Body'].read()
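
If the exact key is not known, the listing can be narrowed to a prefix instead of walking the whole bucket. The following is a minimal sketch, not a definitive implementation: the helper name `find_keys` is hypothetical, `bucket` is assumed to be a boto3 `Bucket` resource like the one above, and the bucket name and prefix in the usage comment are placeholders.

```python
def find_keys(bucket, prefix="", suffix=".csv"):
    """Return the keys under `prefix` that end with `suffix`.

    `bucket` is assumed to be a boto3 Bucket resource, e.g.
    boto3.resource('s3').Bucket('my_bucket'). The helper name is illustrative.
    """
    # objects.filter(Prefix=...) only returns keys that start with the prefix
    # and paginates for you, in the same way as objects.all().
    return [obj.key
            for obj in bucket.objects.filter(Prefix=prefix)
            if obj.key.endswith(suffix)]


# Usage (needs boto3 and valid credentials; names are placeholders):
# import boto3
# bucket = boto3.resource('s3').Bucket('my_bucket')
# print(find_keys(bucket, prefix='some/folder/'))
```

Each string printed this way is a complete key and can be passed unchanged as `Key` to `get_object`.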

To read one CSV object straight into a pandas DataFrame, you can also do this:

import os
import boto3
import pandas as pd
import sys

if sys.version_info[0] < 3: 
    from StringIO import StringIO # Python 2.x
else:
    from io import StringIO # Python 3.x

# get your credentials from environment variables
aws_id = os.environ['AWS_ID']
aws_secret = os.environ['AWS_SECRET']

client = boto3.client('s3', aws_access_key_id=aws_id,
        aws_secret_access_key=aws_secret)

bucket_name = 'my_bucket'

object_key = 'my_file.csv'
csv_obj = client.get_object(Bucket=bucket_name, Key=object_key)
body = csv_obj['Body']
csv_string = body.read().decode('utf-8')

df = pd.read_csv(StringIO(csv_string))
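
A note on `object_key`: in S3 the key is the object's full path inside the bucket (any folder prefix plus the file name), and it does not include the bucket name itself. A `NoSuchKey` error usually means the key string does not match an existing object exactly. The sketch below shows one way to split a combined path into the two arguments `get_object` expects. The helper names `split_s3_path` and `read_s3_text` are hypothetical, and `client` is assumed to be a boto3 S3 client like the one created above.

```python
def split_s3_path(path):
    """Split 'bucket/prefix/file.csv' (optionally with 's3://') into (bucket, key)."""
    if path.startswith("s3://"):
        path = path[len("s3://"):]
    # Everything before the first '/' is the bucket; the rest is the key.
    bucket_name, _, object_key = path.partition("/")
    return bucket_name, object_key


def read_s3_text(client, bucket_name, object_key, encoding="utf-8"):
    """Fetch one object and return its decoded text.

    `client` is assumed to be a boto3 S3 client (boto3.client('s3', ...)).
    """
    response = client.get_object(Bucket=bucket_name, Key=object_key)
    return response["Body"].read().decode(encoding)


# Usage (needs boto3 and valid credentials; names are placeholders):
# bucket_name, object_key = split_s3_path("my_bucket/some/folder/my_file.csv")
# csv_string = read_s3_text(client, bucket_name, object_key)
# df = pd.read_csv(StringIO(csv_string))
```

Passing the client in as a parameter keeps the helper independent of how the credentials are supplied.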
Lucifer
  • I ran the code, but it gives me an error. In object_key, should I give the full path of the file or just the file name? – abhishek Sep 27 '19 at 10:21
  • I'm getting this error: botocore.errorfactory.NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist. – abhishek Sep 27 '19 at 10:22
  • Is the error for the object key or for the AWS ID? – Lucifer Feb 11 '20 at 12:13
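
To tell a wrong key apart from a credentials problem, the error code carried by the exception can be inspected. This is a rough sketch under stated assumptions: the helper name `classify_s3_error` is hypothetical, `err` is assumed to be a `botocore.exceptions.ClientError` (which exposes the service error code under `err.response['Error']['Code']`), and the list of codes is not exhaustive.

```python
def classify_s3_error(err):
    """Return a short hint about what to check for a botocore ClientError."""
    # The service error code lives in err.response['Error']['Code'].
    code = err.response.get("Error", {}).get("Code", "")
    if code == "NoSuchKey":
        return "key"          # bucket and credentials were accepted; key is wrong
    if code == "NoSuchBucket":
        return "bucket"
    if code in ("InvalidAccessKeyId", "SignatureDoesNotMatch", "AccessDenied"):
        return "credentials or permissions"
    return "other"


# Usage (needs boto3/botocore and valid credentials; names are placeholders):
# import botocore
# try:
#     client.get_object(Bucket=bucket_name, Key=object_key)
# except botocore.exceptions.ClientError as err:
#     print(classify_s3_error(err))
```

For the `NoSuchKey` error quoted above, this would point at the object key rather than the AWS ID.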