I have a couple of questions on using groupby on dask dataframes. As I understand it, iterating over a groupby result the way one does in Pandas doesn't work in dask, i.e.
for name, group in grouped:
    logger.info((name, group))
isn't allowed. We're supposed to use apply instead.
However, in Pandas if I wanted to find out the number of groups I could do the following:
len(grouped.groups)
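To make the Pandas side concrete, here is a minimal sketch of what I mean. The frame and the column names `key` and `value` are made up for illustration; they are not from my real data.

```python
import pandas as pd

# Hypothetical data; "key" and "value" are placeholder column names.
df = pd.DataFrame({"key": ["a", "b", "a", "c"], "value": [1, 2, 3, 4]})
grouped = df.groupby("key")

# grouped.groups maps each group label to the row labels in that group,
# so its length is the number of groups.
n_groups = len(grouped.groups)

# The ngroups attribute reports the same count directly.
assert n_groups == grouped.ngroups
print(n_groups)
```

This is the count I would like to obtain from the dask groupby as well.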
By using apply, I would expect to be able to do this for a groupby on a dask dataframe:
d_grouped.apply(len)
But that doesn't work. How can I find out the number of groups resulting from a groupby on a dask dataframe?