
I am using the following script to load a CSV file of around 2 GB, and after 24 hours nothing has happened. Am I doing something wrong?

import spacy
import pandas as pd

nlp = spacy.load('en_core_sci_lg')
bundle = ''
pattern = ""
print('start running')
column_names = ["Origina_subject", "Predicted_subject", "Original_object", "Predicted_object", 'original_sent']
final_list = []
# bucket and file_name are defined elsewhere; reading an s3:// path
# with pandas requires the s3fs package to be installed
data_location = 's3://{}/{}'.format(bucket, file_name)
data = pd.read_csv(data_location)

print('finish loading')

This is the output I am getting; it clearly never gets past the load:

arn:aws:iam::0*********
wait
start running
1 Answer

You can download the file into an in-memory buffer using the boto3 client and then read it with pandas:

# Read dataframe
import boto3
import pandas as pd
from io import BytesIO
s3 = boto3.resource('s3')
bucket = 'your-bucket'
key = 'some/key/file'
with BytesIO() as data:
    s3.Bucket(bucket).download_fileobj(key, data)
    data.seek(0) # move back to the beginning after writing
    df = pd.read_csv(data)
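
For a file around 2 GB, loading everything at once can also exhaust memory. A minimal sketch of chunked reading with pandas' `chunksize` parameter is shown below; it uses an in-memory `StringIO` as a stand-in for the downloaded S3 buffer, so the source and the chunk size here are illustrative, not part of your setup:

```python
import pandas as pd
from io import StringIO

# Stand-in for the BytesIO object downloaded from S3 above
csv_data = StringIO("a,b\n1,2\n3,4\n5,6\n7,8\n")

# Read the CSV in fixed-size chunks instead of all at once;
# each chunk is a regular DataFrame you can process incrementally
chunks = []
for chunk in pd.read_csv(csv_data, chunksize=2):  # 2 rows per chunk
    chunks.append(chunk)

# Recombine (or, better, aggregate each chunk and discard it)
df = pd.concat(chunks, ignore_index=True)
print(len(df))  # 4 rows total
```

Processing each chunk and discarding it keeps peak memory bounded by the chunk size rather than the file size.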