16

Why can two functions with the same id value have differing attributes like __doc__ or __name__?

Here's a toy example:

some_dict = {}
for i in range(2):
    def fun(self, *args):
        print i
    fun.__doc__ = "I am function {}".format(i)
    fun.__name__ = "function_{}".format(i)
    some_dict["function_{}".format(i)] = fun

my_type = type("my_type", (object,), some_dict)
m = my_type()

print id(m.function_0)
print id(m.function_1)
print m.function_0.__doc__
print m.function_1.__doc__
print m.function_0.__name__
print m.function_1.__name__
print m.function_0()
print m.function_1()

Which prints:

57386560
57386560
I am function 0
I am function 1
function_0
function_1
1 # <--- Why is it bound to the most recent value of that variable?
1

I've tried mixing in a call to copy.deepcopy (not sure if the recursive copy is needed for functions or it is overkill) but this doesn't change anything.

ely
  • "Why is it bound to the most recent value of that variable?", because of the closure and the deferred lookup of `i`. – Hyperboreus Mar 21 '14 at 18:45
  • Why doesn't the same thing happen with `fun.__doc__`, which also depends on `i`? – ely Mar 21 '14 at 18:46
  • Because `format` is evaluated immediately, while the body of `fun` is not. – Hyperboreus Mar 21 '14 at 18:47
  • You want to keep to *one* question per post, really. There is a partial dupe here for [Local variables in Python nested functions](http://stackoverflow.com/q/12423614) – Martijn Pieters Mar 21 '14 at 18:48

4 Answers

18

You are comparing methods, and method objects are created anew each time you access one on an instance or class (via the descriptor protocol).

Once id() has returned, the method object is discarded (nothing references it any more), so Python is free to reuse the same id for the next method object it creates. To compare the actual functions, use m.function_0.__func__ and m.function_1.__func__:

>>> id(m.function_0.__func__)
4321897240
>>> id(m.function_1.__func__)
4321906032

Method objects inherit the __doc__ and __name__ attributes from the function that they wrap. The actual underlying functions are really still different objects.
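To make the point about freshly created method objects concrete, here is a minimal sketch. The class `Demo` and its method `greet` are hypothetical names used only for illustration, the behaviour shown is that of CPython, and the snippet is written for Python 3 (`print` as a function).

```python
class Demo(object):
    def greet(self):
        """Say hello."""
        return "hello"

d = Demo()

# Each attribute access builds a new bound method object.
first = d.greet
second = d.greet

# Both objects are kept alive by the names above, so they cannot share an id.
distinct_objects = first is not second
# They still wrap one and the same underlying function.
same_function = first.__func__ is second.__func__
# Bound methods compare equal when function and instance match.
equal_methods = first == second
# __doc__ is forwarded from the wrapped function.
doc_forwarded = first.__doc__ == Demo.__dict__["greet"].__doc__

print(distinct_objects, same_function, equal_methods, doc_forwarded)
```

Holding a reference to each method, as `first` and `second` do here, is what prevents the id from being reused.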

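As for both calls printing 1: neither function stores the value of i. Both refer to the same variable i (a global in this example; inside an enclosing function it would be a closure variable), and its value is looked up when you call the method, not when the function was created. By then the loop has finished and i is 1. See Local variables in Python nested functions (http://stackoverflow.com/q/12423614).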
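The difference between the attribute assignments and the function body can be seen in isolation with a small sketch. The names `funcs` and `show` are made up for this illustration, and the code is written for Python 3.

```python
funcs = []
for i in range(3):
    def show():
        # i is not evaluated here; it is looked up when show() is called.
        return i
    # str.format runs immediately, so it sees the current value of i.
    show.__name__ = "function_{}".format(i)
    funcs.append(show)

names = [f.__name__ for f in funcs]
late_values = [f() for f in funcs]

print(names)        # ['function_0', 'function_1', 'function_2']
print(late_values)  # [2, 2, 2]
```

Every function reports the final value of `i`, while each `__name__` keeps the value that was current when the string was built.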

The easiest work-around is to add another scope with a factory function:

some_dict = {}
for i in range(2):
    def create_fun(i):
        def fun(self, *args):
            print i
        fun.__doc__ = "I am function {}".format(i)
        fun.__name__ = "function_{}".format(i)
        return fun
    some_dict["function_{}".format(i)] = create_fun(i)
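For reference, a complete version of the factory approach might look like the sketch below. It is written for Python 3, moves `create_fun` out of the loop (defining it once is enough), and returns `i` instead of printing it so the result can be checked; those three changes are assumptions of this sketch, not part of the code above.

```python
def create_fun(i):
    # i is a local of create_fun, so every call gets its own binding.
    def fun(self, *args):
        return i
    fun.__doc__ = "I am function {}".format(i)
    fun.__name__ = "function_{}".format(i)
    return fun

some_dict = {}
for i in range(2):
    some_dict["function_{}".format(i)] = create_fun(i)

my_type = type("my_type", (object,), some_dict)
m = my_type()

results = (m.function_0(), m.function_1())
print(results)  # (0, 1)
```

The generated functions take only `self` and `*args`; `i` does not appear in their signature.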
Martijn Pieters
  • On the last comment, is there a way (besides placing the variable within the function signature) to memoize the value of `i` to the function, so that it looks up the value that was present when it was created? I think that is the real question I have, but I did not realize it until after reading your answer to my first layer of confusion. – ely Mar 21 '14 at 18:52
  • @EMS: See the linked post; I present you with several options there. If that's your real question, then it is a dupe of the linked post. – Martijn Pieters Mar 21 '14 at 18:53
  • All of them seem to require that the variable-to-be-bound is passed as an argument. Whereas, I'm trying to make some functions that have the same body and the function signature is largely the same, but the argument signature will be different. – ely Mar 21 '14 at 18:57
  • @EMS: The first option creates a new scope; each time you call that scope the closure is bound in that scope. The resulting function object has no `i` in the signature, for example. – Martijn Pieters Mar 21 '14 at 18:58
  • In my case, the real goal is to write functions that read about database connections and queries to perform, as well as the fields that are allowed to vary in the query. The logic is identical based on the query and connection info, and the only thing that needs to change is the set of keyword args. I suppose making extra args with defaults set might be OK, but will lead to some superfluous args. – ely Mar 21 '14 at 18:58
  • @EMS: Ah, sorry, I meant the second option; `partial()` objects cannot be used as methods (they don't support the descriptor protocol). That option produces a new function to add to your class, but the closure there will never vary as it is a new local in the factory function. – Martijn Pieters Mar 21 '14 at 19:01
  • Thanks for all of the clarifications. The linked answer is extremely valuable in its thoroughness. – ely Mar 21 '14 at 20:30
3

Per your comment on ndpu's answer, here is one way you can create the functions without needing to have an optional argument:

some_dict = {}
for i in range(2):
    def funGenerator(i):
        def fun1(self, *args):
            print i
        return fun1
    fun = funGenerator(i)
    fun.__doc__ = "I am function {}".format(i)
    fun.__name__ = "function_{}".format(i)
    some_dict["function_{}".format(i)] = fun
Rob Watts
  • @EMS: this uses the second option in the dupe post; a factory function creating a new scope. – Martijn Pieters Mar 21 '14 at 19:07
  • One issue I am thinking about with this approach: What if `self` is not known ahead of time. In this example, how does the function call `funGenerator(i)` work? It should expect `self` to be the first argument, right? How to get around that? – ely Mar 21 '14 at 20:11
  • @EMS Good catch. Actually, I really didn't need to give `funGenerator` any arguments other than `i`. I've updated my answer to reflect that. – Rob Watts Mar 21 '14 at 20:15
2

You need to capture the current value of i in each function to avoid this output:

1 # <--- Why is it bound to the most recent value of that variable?
1

One way is to make it the default value of a function argument:

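for i in range(2):
    # i=i stores the current value as the default; note that a second
    # positional argument from the caller would overwrite it
    def fun(self, i=i, *args):
        print i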
# ...

or create a closure:

for i in range(2):
    def f(i):
        def fun(self, *args):
            print i
        return fun
    fun = f(i)
# ...
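As a hedged variation on the default-argument option: in Python 3 the captured value can be declared after `*args`, which makes it keyword-only so that extra positional arguments cannot overwrite it. This sketch assumes Python 3 and returns the value instead of printing it; neither assumption comes from the answer above.

```python
some_dict = {}
for i in range(2):
    # i=i is evaluated now, at definition time; placing it after *args
    # makes it keyword-only (Python 3 syntax).
    def fun(self, *args, i=i):
        return i
    fun.__name__ = "function_{}".format(i)
    some_dict[fun.__name__] = fun

my_type = type("my_type", (object,), some_dict)
m = my_type()

plain = (m.function_0(), m.function_1())
with_extra_positional = m.function_1("extra")  # "extra" lands in args
overridden = m.function_0(i=5)                 # explicit keyword still wins

print(plain, with_extra_positional, overridden)  # (0, 1) 1 5
```

The value remains part of the signature, so a caller can still override it by keyword; the factory-function option avoids that.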
ndpu
2

@Martijn Pieters is perfectly correct. To illustrate, try this modification:

some_dict = {}

for i in range(2):
    def fun(self, *args):
        print i

    fun.__doc__ = "I am function {}".format(i)
    fun.__name__ = "function_{}".format(i)
    some_dict["function_{}".format(i)] = fun
    print "id",id(fun)

my_type = type("my_type", (object,), some_dict)
m = my_type()

print id(m.function_0)
print id(m.function_1)
print m.function_0.__doc__
print m.function_1.__doc__
print m.function_0.__name__
print m.function_1.__name__
print m.function_0()
print m.function_1()

c = my_type()
print c
print id(c.function_0)

You can see that fun gets a different id on each iteration, and that both differ from the id printed for the methods. The method ids coincide because a method object is created on each attribute access and discarded as soon as id() returns, so the same memory location is reused for the next one. For the same reason, a method accessed on another instance of my_type can report the same address.

This code gives:
id 4299601152
id 4299601272
4299376112
4299376112

I am function 0
I am function 1
function_0
function_1
1
None
1
None
<__main__.my_type object at 0x10047c350>
4299376112
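A short sketch can confirm which objects are really shared. It rebuilds a class the same way as above, with function bodies reduced to `return None` for brevity (an assumption of this sketch), and uses identity checks instead of printed ids. It is written for Python 3.

```python
some_dict = {}
for i in range(2):
    def fun(self):
        return None
    fun.__name__ = "function_{}".format(i)
    some_dict[fun.__name__] = fun

my_type = type("my_type", (object,), some_dict)
m = my_type()
c = my_type()

# The class stores the very function objects created in the loop.
stored_is_original = my_type.__dict__["function_0"] is some_dict["function_0"]
# Every instance wraps that same function in its bound methods.
shared_across_instances = m.function_0.__func__ is c.function_0.__func__
# The two functions themselves remain distinct objects.
functions_differ = m.function_0.__func__ is not m.function_1.__func__

print(stored_is_original, shared_across_instances, functions_differ)
```

Identity checks with `is` keep both operands alive during the comparison, so they are not affected by id reuse.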

Simon