
Use case: Let's say I have a very large set of pandas Series objects and I want to apply the .dropna() method to all of them. Because they are so large, I must use a function that does this with multiprocessing: it accepts a list of objects and passes each object in the list as the argument to a method (which is itself passed as an object).

The first, implicit argument to an instance method is self.

Can I use a lambda, partial or other trickery to pass the .dropna() method and replace its self argument with a specific Series instance? Is this possible in Python?

Willem Van Onsem
Jeremy Barnes

1 Answer


Yes. The first argument is only "implicit" when the method is looked up on an instance of the class using . notation, e.g.

sr = Series(...)
sr.dropna()

In this case, sr.dropna is a bound method: it behaves like a closure that fills in the self parameter of Series.dropna with a reference to sr, sort of like

lambda: Series.dropna(sr)

You can always call the method from the class directly and pass the self parameter explicitly. The example above can be rewritten as:

sr = Series(...)
Series.dropna(sr)
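To make the equivalence concrete without needing pandas, here is a minimal sketch using a hypothetical stand-in class. The names Box and without_none are made up for illustration and are not part of any library; the __self__ and __func__ attributes are standard on Python bound methods.

```python
from functools import partial


class Box:
    """Hypothetical stand-in for a class such as Series."""

    def __init__(self, items):
        self.items = items

    def without_none(self):
        # Plays the role of a no-argument method like dropna().
        return [x for x in self.items if x is not None]


b = Box([1, None, 3])

bound = b.without_none    # bound method: self is already filled in
plain = Box.without_none  # plain function: self must be passed explicitly

# The bound method remembers both the instance and the underlying function.
assert bound.__self__ is b
assert bound.__func__ is plain

# All three spellings produce the same result.
via_partial = partial(Box.without_none, b)
assert bound() == plain(b) == via_partial() == [1, 3]
```

So a lambda or partial works, but neither is needed: the function looked up on the class already accepts the instance as its first argument.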

You can pass the function Series.dropna around as necessary and call it on whatever objects you need. Notice that Series.dropna requires the instance as an explicit first argument, while sr.dropna already has it bound and can be called with no arguments.

This is not trickery; it is a deliberate design choice in the language, intended for cases just like yours (and many others as well).
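For the use case in the question, a rough sketch might look like the following. It assumes pandas is installed; the helper names drop_missing_all and drop_missing_parallel and their parameters are hypothetical, not part of pandas. The parallel variant also assumes that pd.Series.dropna can be pickled by reference, which should hold for a function defined on an importable class, but it is worth checking in your environment.

```python
from multiprocessing import Pool

import pandas as pd


def drop_missing_all(series_list, map_func=map):
    # Hypothetical helper: pass the function from the class and let
    # map_func supply each Series as the explicit self argument.
    return list(map_func(pd.Series.dropna, series_list))


def drop_missing_parallel(series_list, processes=2):
    # Hypothetical parallel variant using a process pool. Call it from
    # under an `if __name__ == "__main__":` guard on platforms that
    # spawn worker processes.
    with Pool(processes) as pool:
        return pool.map(pd.Series.dropna, series_list)


data = [pd.Series([1.0, None, 3.0]), pd.Series([None, 2.0])]
cleaned = drop_missing_all(data)
```

Whether multiprocessing actually pays off depends on the data: each Series has to be pickled and sent to a worker, and the result sent back, so it is worth timing against the serial version.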

Mad Physicist
  • Thank you. I feel like closures in Python are not discussed enough. – Jeremy Barnes Apr 06 '17 at 19:52
  • Yes, the rules can be a bit obscure at times. This particular one is pretty simple, but it took me a long time to realize that a *different* closure is created every time you access a method via `.` on the object: http://stackoverflow.com/q/41900639/2988730 – Mad Physicist Apr 06 '17 at 20:00
  • Nitpick: `sr.dropna` is the closure; `sr.dropna()` is the application of the closure to its remaining required arguments (of which there are none). – chepner Apr 06 '17 at 21:11
  • @chepner. Good catch, fixed. – Mad Physicist Apr 06 '17 at 21:13