Why is conversion of Dask dataframe to pandas dataframe really slow?

Question

I'm converting a Dask dataframe to a pandas dataframe using

p_df_data = d_df_data.compute()

But this is really slow... Is there an alternative method?

This really depends on what transformations you are applying. Related: [why is multiprocessing slower than a simple computation in Pandas?](https://stackoverflow.com/questions/49837539/why-is-multiprocessing-slower-than-a-simple-computation-in-pandas) — jpp, Sep 28 '18 at 12:22

score 2 · Accepted Answer · answered Sep 29 '18 at 00:06


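Dask dataframes are lazy. Every operation looks free until you call compute, at which point they all run. The time you see in compute is therefore the cost of the whole pipeline (loading the data plus every transformation), not just the conversion to pandas.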


MRocklin
