I'm trying to extract data from BigQuery, save it to a CSV file, and then upload it to S3, but I'm getting an error on the upload to S3. This is the error I get when I run the script:

    raise ValueError('Filename must be a string')

Could you please help me solve this issue? I'm new to Python and AWS. Thank you.

Script is:



    rows_df = query_job.result().to_dataframe() 
    file_csv = rows_df.to_csv(s3_filename, sep='|', index=False, encoding='utf-8')
    s3.upload_file(file_csv, s3_bucket, file_csv)


Justine

2 Answers


Try changing the arguments passed to s3.upload_file like so:

    s3.upload_file(s3_filename, s3_bucket, s3_filename)

The to_csv call writes the dataframe to a local file at the path s3_filename and returns None, so file_csv is None. Alternatively, if your dataframe is small enough to be held in memory, the following should do the trick:

    import io

    # to_csv with no path returns the CSV content as a string
    data = rows_df.to_csv(sep='|', index=False, encoding='utf-8')
    # upload_fileobj expects a binary file-like object, so encode the string to bytes
    data_buffer = io.BytesIO(data.encode('utf-8'))
    s3.upload_fileobj(data_buffer, s3_bucket, s3_filename)
Milan Cermak
  • What's the difference between upload_file and upload_fileobj? I tried running the script with your suggestion and it returned the error ValueError: Fileobj must implement read. – Justine Jun 07 '20 at 17:45
  • Here's a SO answer to that: https://stackoverflow.com/questions/52336902/what-is-the-difference-between-s3-client-upload-file-and-s3-client-upload-file I've updated the second code example; data_buffer is now a file-like object that upload_fileobj accepts. – Milan Cermak Jun 07 '20 at 18:35
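
To make the difference raised in the comments concrete, here's a minimal sketch (the bucket name, keys, and paths below are placeholders): upload_file takes the path of a local file as a string, while upload_fileobj takes an open binary file-like object, i.e. anything with a read() method that returns bytes.

    import io
    import boto3

    s3 = boto3.client('s3')

    # upload_file streams a file from disk, given its path as a string
    s3.upload_file('/tmp/data.csv', 'my-bucket', 'exports/data.csv')

    # upload_fileobj takes an open binary file-like object instead
    with open('/tmp/data.csv', 'rb') as f:
        s3.upload_fileobj(f, 'my-bucket', 'exports/data.csv')

    # an in-memory buffer works too, as in the updated answer above
    buf = io.BytesIO(b'col_a|col_b\n1|2\n')
    s3.upload_fileobj(buf, 'my-bucket', 'exports/data.csv')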

Based on the pandas docs, to_csv returns None when path_or_buf is specified. However, upload_file needs a local filename and an S3 key as its first and third arguments respectively. Therefore, something like this should make it work:

    s3.upload_file(s3_filename, s3_bucket, s3_filename)
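
For completeness, a minimal sketch of the whole flow under the same assumptions (the bucket, table, and file names are placeholders, and credentials are already configured for both clients):

    import boto3
    from google.cloud import bigquery

    bq = bigquery.Client()
    s3 = boto3.client('s3')

    s3_bucket = 'my-bucket'      # placeholder bucket name
    s3_filename = 'export.csv'   # used as both the local path and the S3 key

    query_job = bq.query('SELECT * FROM `my_dataset.my_table`')  # placeholder query
    rows_df = query_job.result().to_dataframe()

    # to_csv writes to the local path and returns None,
    # so pass the path itself to upload_file
    rows_df.to_csv(s3_filename, sep='|', index=False, encoding='utf-8')
    s3.upload_file(s3_filename, s3_bucket, s3_filename)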
jellycsc