I have a DataFrame named "subset", and the code is as follows (pd is the alias for pandas). I can't figure out the meaning of by = lambda x: lambda y: getattr(y, x).

pivot = pd.pivot_table(subset, values='count', rows=['date'], cols=['sample'], fill_value=0)
by = lambda x: lambda y: getattr(y, x)
grouped = pivot.groupby([by('year'),by('month')]).sum()
user94559
zhql0907
  • Possible duplicate of [Why are Python lambdas useful?](http://stackoverflow.com/questions/890128/why-are-python-lambdas-useful) – Mephy Aug 19 '16 at 04:47
  • Assuming `year` and `month` are columns, I am guessing it is doing the same thing as `grouped = pivot.groupby(['year', 'month']).sum()` – Stefano Potter Aug 19 '16 at 04:49
  • @StefanoPotter `year` and `month` are not columns. This is the first time they appear in the code, and it runs fine. There is a column named 'date' in subset, and I'm trying to find out how they relate to it. – zhql0907 Aug 19 '16 at 07:35

1 Answer

by = lambda x: lambda y: getattr(y, x) is equivalent to the following:

def by(x):
    def getter(y):
        return getattr(y, x)
    return getter

getattr(a, b) gets an attribute with the name b from an object named a.
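As a quick check of both points, here is a minimal sketch that compares the lambda form with the def form and calls getattr directly. The datetime.date object is only an illustrative stand-in (it is not from the question), and by_def is a hypothetical name chosen so the two versions can coexist.

```python
from datetime import date

# The one-liner from the question.
by = lambda x: lambda y: getattr(y, x)

# The expanded form, renamed by_def (hypothetical name) so both can coexist.
def by_def(x):
    def getter(y):
        return getattr(y, x)
    return getter

d = date(2016, 8, 19)  # illustrative object that has .year and .month

# getattr(obj, 'name') looks up the attribute whose name is given as a string.
year_direct = getattr(d, 'year')  # same as d.year

# Both forms build a getter for a fixed attribute name, then apply it to d.
year_lambda = by('year')(d)
year_def = by_def('year')(d)
month_lambda = by('month')(d)
```

All three year lookups give the same value as d.year, which is why the two definitions can be used interchangeably.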

So by('bar') returns a function that returns the attribute 'bar' from an object.

by('bar')(foo) means getattr(foo, 'bar') which is roughly foo.bar.
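To tie this back to the groupby call in the question: when groupby is given a function as a key, pandas calls it on the values of the object's index. The sketch below assumes the index of pivot holds datetime values (the question only says it is built from the 'date' column), in which case each index value has year and month attributes. The small frame and its column names 'a' and 'b' are made up for illustration.

```python
import pandas as pd

by = lambda x: lambda y: getattr(y, x)

# Hypothetical stand-in for `pivot`: a frame indexed by dates
# (assumed to be datetimes, as in the question's 'date' rows).
idx = pd.to_datetime(['2016-07-30', '2016-07-31', '2016-08-01'])
pivot = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]}, index=idx)

# by('year') and by('month') are applied to the index values,
# so rows are grouped by (year, month) of their date.
grouped = pivot.groupby([by('year'), by('month')]).sum()

# The same grouping, written without the helper.
grouped_direct = pivot.groupby([pivot.index.year, pivot.index.month]).sum()
```

Under that assumption, year and month never need to exist as columns: they are attributes read off the dates in the index.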

If that doesn't help, let us know which part you're still having trouble with.

user94559