
The code below reads a CSV file from AWS S3 using PyCharm on my local machine.

# Read a CSV from S3

import os
import boto3
import pandas as pd
import sys

if sys.version_info[0] < 3:
    from StringIO import StringIO  # Python 2.x
else:
    from io import StringIO

aws_id = 'XXXXXXXXXXXXXXX'
aws_secret = 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'

client = boto3.client('s3', aws_access_key_id=aws_id, aws_secret_access_key=aws_secret)

bucket_name = 'bucket-name'
object_key = 'folder-name/test.csv'

csv_obj = client.get_object(Bucket=bucket_name, Key=object_key)

body = csv_obj['Body']

csv_string = body.read().decode('utf-8')

df = pd.read_csv(StringIO(csv_string))

x = df.head()

print(x)

I would like to be able to read multiple CSV files in the same way, essentially everything that sits under the folder. A sketch of one approach is included after the listing below.

My files are in the following directory:

bucket-name/folder-name/year=2018/month=01/file_032342.csv
bucket-name/folder-name/year=2018/month=02/file_434423.csv
bucket-name/folder-name/year=2018/month=03/file_343254.csv
bucket-name/folder-name/year=2018/month=04/file_544353.csv
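
A minimal sketch of one way to do this, reusing the `aws_id`, `aws_secret` and `bucket_name` values from the snippet above: list every object under the `folder-name/` prefix with the `list_objects_v2` paginator, read each CSV the same way as before, and concatenate the results into one DataFrame.

import boto3
import pandas as pd
from io import StringIO

client = boto3.client('s3', aws_access_key_id=aws_id, aws_secret_access_key=aws_secret)

prefix = 'folder-name/'
paginator = client.get_paginator('list_objects_v2')

frames = []
for page in paginator.paginate(Bucket=bucket_name, Prefix=prefix):
    for obj in page.get('Contents', []):
        key = obj['Key']
        if not key.endswith('.csv'):
            continue  # skip folder markers and non-CSV objects
        # Fetch and decode each object, then parse it with pandas
        body = client.get_object(Bucket=bucket_name, Key=key)['Body']
        frames.append(pd.read_csv(StringIO(body.read().decode('utf-8'))))

# Combine all per-file DataFrames into one
df = pd.concat(frames, ignore_index=True)
print(df.head())

The prefix match is recursive, so the year=/month= partition folders are picked up automatically; filtering on `.csv` avoids choking on any non-data objects under the prefix.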
  • Possible duplicate of [Reading multiple csv files from S3 bucket with boto3](https://stackoverflow.com/questions/52855221/reading-multiple-csv-files-from-s3-bucket-with-boto3) – vielkind Jan 16 '19 at 16:35
  • Not really good practice to store your `aws_id` and `aws_secret` in your python file. Store it in a separate config file and never put it to version control – DollarAkshay Jan 16 '19 at 16:36
  • What is your actual question/problem? You say "I would like to be able to read multiple CSV files". Have you tried it? Did you encounter a problem? – John Rotenstein Jan 16 '19 at 23:19
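
Following up on DollarAkshay's comment: a minimal sketch of keeping the keys out of the script, assuming the credentials are stored in the standard AWS locations (environment variables or ~/.aws/credentials); the profile name 'default' here is only an example.

import boto3

# boto3 resolves credentials automatically from the environment
# (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY) or from ~/.aws/credentials,
# so no keys have to appear in the source file.
session = boto3.Session(profile_name='default')
client = session.client('s3')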

0 Answers