Write a dask dataframe to one file

Question

I can write a massive dask data frame to disk like so:

raw_data.to_csv(r'C:\Bla\SubFolder\*.csv')

This produces chunked data of the original (massaged) dataset in the subfolder:

C:\Bla\SubFolder\

Just wondering, can I force dask to write the data as one file?

Possible duplicate of [Writing Dask partitions into single file](https://stackoverflow.com/questions/39566809/writing-dask-partitions-into-single-file) — MRocklin, Aug 09 '18 at 14:00
@MRocklin thanks but is this really a solution? write everything in chunks and then put it all together again? — cs0815, Aug 09 '18 at 15:05

score 1 · Answer 1 · answered May 27 '21 at 16:09

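To save everything into a single file, pass single_file=True and give the path of one concrete file instead of a * pattern. In single-file mode the path is used as the file name itself, and * is not a valid character in a Windows file name. The name output.csv below is only a placeholder:

df.to_csv(r'C:\Bla\SubFolder\output.csv', single_file=True)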

Gonçalo Peres
