18

I need to group a DataFrame by a datetime column AND a str (object) column, apply a function to each group, and assign the result to every row that belongs to that group (a groupby-transform). I understand the groupby workflow, but I cannot work out how to build a pandas.Grouper that covers both conditions at the same time. Thus:

How to use pandas.Grouper on multiple columns?

pablete

2 Answers

29

Call DataFrame.groupby with a list of pandas.Grouper objects as the by argument, like this:

# 'value' is a placeholder for the column you want to transform
df['result'] = df.groupby([
                 pd.Grouper(key='dt', freq='D'),
                 pd.Grouper(key='other_column')
               ])['value'].transform(foo)
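
To make the pattern above concrete, here is a minimal sketch on made-up data. The sample rows, the value column, and the body of foo (subtracting the group mean) are assumptions for illustration only; they are not from the question.

```python
import pandas as pd

# Hypothetical sample data; 'dt' and 'other_column' follow the answer,
# 'value' is an assumed numeric column to transform.
df = pd.DataFrame({
    "dt": pd.to_datetime([
        "2021-09-01 08:00", "2021-09-01 09:15",
        "2021-09-01 17:30", "2021-09-02 10:00",
    ]),
    "other_column": ["a", "b", "a", "a"],
    "value": [1.0, 10.0, 2.0, 5.0],
})


def foo(group_values: pd.Series) -> pd.Series:
    """Placeholder function: center each group on its own mean."""
    return group_values - group_values.mean()


# Group by calendar day of 'dt' and by 'other_column', then write the
# per-group result back onto the rows that make up each group.
df["result"] = df.groupby([
    pd.Grouper(key="dt", freq="D"),
    pd.Grouper(key="other_column"),
])["value"].transform(foo)

print(df)
```

Rows 0 and 2 fall on the same day and share other_column == "a", so they form one group; the other two rows are each a group of their own.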
pablete
3

If your second column is a non-datetime series, you can group by it together with a datetime column like this:

# 'value' is a placeholder for the column you want to transform
df['res'] = df.groupby([
                 pd.Grouper(key='dt', freq='D'),
                 'other_column'
               ])['value'].transform(foo)

Note that in this case you don't have to use pd.Grouper for the second column because it's a string object and not a time object. The freq argument of pd.Grouper is only compatible with datetime-like columns.
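
For what it's worth, one way to check how pd.Grouper treats a non-datetime column is to run both spellings side by side. In the sketch below the sample data and the value column are made up, and the expectation (a Grouper given only key=, with no freq, behaves like passing the column name directly) is based on the pandas.Grouper docs rather than on anything stated in the question.

```python
import pandas as pd

# Hypothetical sample data with a datetime column and a string column.
df = pd.DataFrame({
    "dt": pd.to_datetime([
        "2021-09-01 08:00", "2021-09-01 09:15",
        "2021-09-01 17:30", "2021-09-02 10:00",
    ]),
    "other_column": ["a", "b", "a", "a"],
    "value": [1.0, 10.0, 2.0, 5.0],
})

# Spelling 1: a Grouper for both columns (no freq on the string column).
by_grouper = df.groupby([
    pd.Grouper(key="dt", freq="D"),
    pd.Grouper(key="other_column"),
])["value"].transform("mean")

# Spelling 2: a Grouper for the datetime column, a plain name for the other.
by_name = df.groupby([
    pd.Grouper(key="dt", freq="D"),
    "other_column",
])["value"].transform("mean")

# True if both spellings produce identical per-row results.
same = by_grouper.equals(by_name)
print(same)
```

If the first spelling raises on your data, it is worth checking whether freq was passed for the string column by mistake, since freq only makes sense for datetime-like keys.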

Usman Ahmad
  • is it "you don't have to" or "is only compatible"? could you clarify that? The former would imply that the current accepted answer is valid, the latter would imply that the current accepted answer is invalid. – pablete Sep 09 '21 at 23:11
  • I tried the accepted answer and it didn't work for me because in my case (as in the question posted here) 'other_column' was not a datetime series but a string series. pd.Grouper only accepts a datetime series. If you give it something else, like a string series, it won't work and will throw an exception. Hope that clarifies. – Usman Ahmad Sep 10 '21 at 07:29
  • Thanks for the comment. It's weird, because from the docs (https://pandas.pydata.org/docs/reference/api/pandas.Grouper.html) it does seem like Grouper can take the name of the column too. – pablete Sep 10 '21 at 19:25
  • what does foo mean after transform? – Gustavo Zárate Aug 01 '22 at 22:54
  • @GustavoZárate it's a placeholder for a function to feed into the GroupBy.transform method. transform applies foo to the values of each group BUT returns the result as a Series with the same index (and thus the same length and order) as the original DataFrame. More info in the docs: https://pandas.pydata.org/pandas-docs/stable/user_guide/groupby.html#transformation – pablete Aug 12 '22 at 02:18
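
To illustrate the difference described in the last comment, here is a small sketch comparing an aggregation with a transform. The data is made up, and it groups by a single column only to keep the output short; the same shape behavior is expected with a list of groupers.

```python
import pandas as pd

# Hypothetical data: one grouping column and one numeric column.
df = pd.DataFrame({
    "other_column": ["a", "b", "a", "a"],
    "value": [1.0, 10.0, 2.0, 5.0],
})

grouped = df.groupby("other_column")["value"]

# Aggregation: one row per group, indexed by the group key.
aggregated = grouped.sum()

# Transformation: one row per original row, indexed like df,
# so it can be assigned straight back as a new column.
transformed = grouped.transform("sum")
df["group_sum"] = transformed

print(aggregated)
print(df)
```

Here "sum" plays the role of foo; any function that maps a group's values to a scalar or to a same-length Series can be passed instead.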