How to write pyarrow table as csv to s3 directly?

Asked Jun 01 '22 at 18:30

Active Jun 01 '22 at 21:35

Viewed 43 times

I am trying to write a pyarrow table as a txt file to an S3 bucket. I have multiple chunks of a table, each chunk being its own table. This is how I'm generating the text file:

import pyarrow.csv as csv

# Write the header only for the first chunk.
for i, chunk in enumerate(table_chunks):
    csv.write_csv(chunk[2], 'myfile.txt',
                  csv.WriteOptions(include_header=(i == 0), delimiter='|'))

This, however, overwrites the file after each chunk is processed.

I want to be able to append each table chunk to the same file in S3. I would really appreciate any help. Thank you.

Hasham

You [cannot append data to an s3 object](https://stackoverflow.com/questions/41783903/append-data-to-an-s3-object). You could write each chunk to a new S3 object, and then assemble them when reading. – larsks Jun 01 '22 at 20:06
I understand, but how do I write a pyarrow table to S3 directly? – Hasham Jun 01 '22 at 21:32

0 Answers