I was hoping to get a performance gain by using a Dask DataFrame instead of pandas on a 6-core MacBook Pro. However, the Dask version runs just as slowly as the pandas version, which takes roughly 5 minutes.
What am I doing wrong here?
import dask.dataframe as dd

ddf = dd.from_pandas(df.set_index('customer seq').sort_index(), npartitions=8)
ddf = ddf.set_index(ddf.index, sorted=True)
paired = ddf.groupby(ddf.index, group_keys=False).apply(retention_contract)
paired = paired.compute(scheduler='processes')