I am using boto to read a csv file from S3 and parse its contents. This is the code I wrote:
import boto
from boto.s3.key import Key
import pandas as pd
import io
conn = boto.connect_s3(keyId, sKeyId)
bucket = conn.get_bucket(bucketName)
# Get the Key object of the given key, in the bucket
k = Key(bucket, srcFileName)
content = k.get_contents_as_string()
reader = pd.read_csv(io.StringIO(content))
for row in reader:
    print(row)
But I am getting an error at the read_csv line:
TypeError: initial_value must be str or None, not bytes
How can I resolve this error and parse the contents of the csv file stored on S3?
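For reference, the TypeError can be reproduced without S3 at all. The sketch below assumes the file is UTF-8 encoded and uses a made-up byte string as a stand-in for what get_contents_as_string() returns under Python 3; io.StringIO only accepts str, so the bytes either need decoding or need to go into io.BytesIO:

```python
import io

# Stand-in for the bytes returned from S3 (hypothetical sample data).
content = b"id,name\n1,apple\n2,pear\n"

# io.StringIO rejects bytes, which is the error from the traceback.
try:
    io.StringIO(content)
    raised = False
except TypeError:
    raised = True

# Option 1: decode to str first (assumes the file is UTF-8).
text_buffer = io.StringIO(content.decode("utf-8"))

# Option 2: keep the bytes and wrap them in a bytes buffer instead.
byte_buffer = io.BytesIO(content)
```

Either buffer can then be handed to pd.read_csv, which accepts both text and binary file-like objects.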
UPDATE: if I use BytesIO instead of StringIO, then the print(row) line only prints the first row of the csv. How do I loop over it?
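A likely explanation for seeing only the first row: pd.read_csv returns a DataFrame, and iterating over a DataFrame directly yields its column labels, which by default are taken from the first line of the file. A minimal sketch with made-up sample data (the column names and values are hypothetical), using itertuples to walk the rows instead:

```python
import io
import pandas as pd

# Hypothetical sample data standing in for the S3 file contents.
content = b"fruit,qty\napple,3\npear,5\n"
df = pd.read_csv(io.BytesIO(content))

# Iterating the DataFrame itself gives column labels, not rows.
labels = [col for col in df]

# itertuples (or iterrows) iterates over the actual data rows.
rows = [tuple(row) for row in df.itertuples(index=False)]

for row in rows:
    print(row)
```

If the file has no header line, passing header=None to read_csv keeps the first line as data.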
This is my current code:
import io
import boto3
import pandas as pd
s3 = boto3.resource('s3', aws_access_key_id=keyId, aws_secret_access_key=sKeyId)
obj = s3.Object(bucketName, srcFileName)
content = obj.get_contents_as_string()
reader = pd.read_csv(io.BytesIO(content), header=None)
count = 0
for index, row in reader.iterrows():
    print(row[1])
When I execute this I get the error: AttributeError: 's3.Object' object has no attribute 'get_contents_as_string'