I'm working through this tutorial: https://medium.com/@wangyuw/data-reshaping-with-pandas-explained-80b2f51f88d2
Everything works, but the following line of code generates a warning:
agg = long_df.reset_index().groupby(['RegionVariable', 'EXP'])[features].agg({'count': len, 'mean': np.mean})
The warning it creates is:
FutureWarning: using a dict with renaming is deprecated and will be removed
in a future version.
For column-specific groupby renaming, use named aggregation
>>> df.groupby(...).agg(name=('column', aggfunc))
return super().aggregate(arg, *args, **kwargs)
I tried to 'fix' it with this:
agg = long_df.reset_index().groupby(['RegionVariable', 'EXP'])[features].agg(name=(('count', len), ('mean', np.mean)))
But I get this error:
KeyError: "Column '('count', <built-in function len>)' does not exist!"
How can len "not exist" in the second version when it works in the first?
More to the point, what is the correct syntax to get this working without generating the deprecation warning?
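For reference, here is a minimal sketch of what I understand the warning to be asking for. Named aggregation expects each keyword to be a (column, aggfunc) pair, so in my attempt above pandas seems to read ('count', len) as a column name, which would explain the KeyError. The small long_df and the feature names f1 and f2 below are made up for illustration and are not the tutorial's data. Since several columns are selected at once, one option is to pass a list of (name, function) tuples; another is to build one keyword per column for named aggregation.

```python
import pandas as pd

# Hypothetical stand-in for the tutorial's long_df (illustration only).
long_df = pd.DataFrame({
    'RegionVariable': ['A', 'A', 'B', 'B'],
    'EXP': [1, 1, 1, 2],
    'f1': [1.0, 3.0, 5.0, 7.0],
    'f2': [2.0, 4.0, 6.0, 8.0],
})
features = ['f1', 'f2']
keys = ['RegionVariable', 'EXP']

# Option 1: a list of (output_name, function) tuples, applied to every
# selected column. The result has MultiIndex columns (feature, output_name).
agg = (
    long_df.reset_index()
    .groupby(keys)[features]
    .agg([('count', len), ('mean', 'mean')])
)

# Option 2: named aggregation, one keyword per (column, function) pair.
# The result has flat columns such as 'f1_count' and 'f1_mean'.
funcs = [('count', len), ('mean', 'mean')]
named = {f'{col}_{name}': (col, func) for col in features for name, func in funcs}
flat = long_df.groupby(keys).agg(**named)

print(agg)
print(flat)
```

Note that with option 1 the column levels come out as (feature, output_name). If later code expects the output name on the outer level, the levels may need to be swapped, for example with swaplevel(axis=1).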
Versions:
Python: 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 21:26:53) [MSC v.1916 32 bit (Intel)]
NumPy: 1.18.1
Pandas: 0.25.3