I have the following dataframe, groupby objects, and functions.
import pandas as pd

df = pd.DataFrame({
    'A': 'a a b b b'.split(),
    'P': 'p p p q q'.split(),
    'B': [1, 2, 3, 4, 5],
    'C': [4, 6, 5, 7, 8],
    'D': [9, 10, 11, 12, 13]})
g1 = df.groupby('A')
g2 = df.groupby('P')
def f1(x, y):
    return sum(x) + sum(y)

def f2(x, y):
    return sum(x) - sum(y)

def f3(x, y):
    return x * y
For g1, I want to:
- apply f1 to columns B and C
- apply f2 to columns C and D

For g2, I want to:
- apply f2 to columns B and C
- apply f3 to columns C and D
The difficulty, for me, is that each function takes multiple columns as arguments. The functions also need to work on any pair of columns; notice that f2 is used for both ['B', 'C'] and ['C', 'D']. Note too that f1 and f2 reduce each group to a single number, while f3 returns one value per row. I'm struggling with the syntax for this.
How can I do all of this with pandas?