I am using groupby-apply in Dask on multiple columns with a user-defined function. However, the group-by columns do not appear in the output data frame, even after calling reset_index. Unfortunately, I could not create a minimal reproducible example without using my own data. My workaround was to add the group-by columns explicitly to the output data frame inside the user-defined function.
- It seems the issue was that the dataframe had an existing index column before the groupby, and Dask does not like that. When I called reset_index before the groupby, everything worked fine. – Hanan Shteingart Sep 05 '22 at 05:55
- Interesting. Knowing the fix now, is it possible to come up with an MRE? – SultanOrazbayev Sep 05 '22 at 06:58
- Here's a good guide to [creating reproducible examples in pandas](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). From there you can just import the dataframe with [`dask.dataframe.from_pandas`](https://docs.dask.org/en/stable/generated/dask.dataframe.from_pandas.html). – Michael Delgado Sep 05 '22 at 07:02