12

I'm learning Python and I have a question, more theoretical than practical, about accessing class variables from a method of the same class.

For example, we have:

class ExampleClass:
    x = 123
    def example_method(self):
        print(self.x)

Why is it necessary to write exactly self.x, not just x? x belongs to the namespace of the class, and the method using it belongs to that namespace too. What am I missing? What is the rationale behind this style?

In C++ you can write:

#include <iostream>

class ExampleClass {
public:
    int x;
    void example_method()
    {
        x = 123;         // implicit this->x
        std::cout << x;
    }
};

And it will work!

Karl Knechtel
Gill Bates
  • I think it has to do with scope. Inside a module, you can declare a "global" variable and call it inside a class declaration, but if you declare something inside a class declaration, (apparently) it already gets "caught" by the class scope, and then needs the self. – heltonbiker Nov 30 '12 at 19:48
  • @senderle That question seems to be mainly about explicitly passing `self` (or at least that's what the accepted answer focuses on). This question is about why there's no implicit `self.` for attributes. –  Nov 30 '12 at 19:56
  • PS. naming it “self” is just a convention and good practice, it can be anything, like unicorns. – Chris Warrick Nov 30 '12 at 19:57
  • @delnan, practically speaking, I think both questions amount to "Why is necessary to write exactly 'self.x', not just 'x'?" But if no one agrees with me, then this won't be closed :) -- I just thought I'd raise the possibility that this is a duplicate. (And rereading the other question, I'd agree that it's ambiguous.) – senderle Nov 30 '12 at 20:00
  • If you do not accept Bakuriu's answer I will hunt you down. How refreshing to have an actual answer to a design decision question. – Mark Ransom Nov 30 '12 at 20:13
  • @senderle No, it isn't a dup of the question you reference. In Gill Bates' question above, there's an assignment instruction inside the class and outside the method, while in the referenced question there is not. – eyquem Nov 30 '12 at 22:12
  • @eyquem, yeah, I see what you mean. It's still a bit ambiguous because the c++ code creates an _instance_ attribute, not a class (a.k.a. `static`) attribute -- so it's still not totally clear that this question is about class attributes rather than instance attributes. But I agree that it's cause for reasonable doubt. – senderle Dec 01 '12 at 00:06
  • @senderle The question you refer to as a dup asks about the purpose of "self" at all; my question is more specific: why the use of "self" is enforced explicitly. Accordingly, the answers to my question are more specific and detailed. Merged into a more general and blurry topic, they would lose their value and sense. – Gill Bates Dec 01 '12 at 08:51

4 Answers

30

From The History of Python: Adding Support for User-defined Classes:

Instead, I decided to give up on the idea of implicit references to instance variables. Languages like C++ let you write this->foo to explicitly reference the instance variable foo (in case there’s a separate local variable foo). Thus, I decided to make such explicit references the only way to reference instance variables. In addition, I decided that rather than making the current object ("this") a special keyword, I would simply make "this" (or its equivalent) the first named argument to a method. Instance variables would just always be referenced as attributes of that argument.

With explicit references, there is no need to have a special syntax for method definitions nor do you have to worry about complicated semantics concerning variable lookup. Instead, one simply defines a function whose first argument corresponds to the instance, which by convention is named "self." For example:

def spam(self,y):
    print self.x, y

This approach resembles something I had seen in Modula-3, which had already provided me with the syntax for import and exception handling. Modula-3 doesn’t have classes, but it lets you create record types containing fully typed function pointer members that are initialized by default to functions defined nearby, and adds syntactic sugar so that if x is such a record variable, and m is a function pointer member of that record, initialized to function f, then calling x.m(args) is equivalent to calling f(x, args). This matches the typical implementation of objects and methods, and makes it possible to equate instance variables with attributes of the first argument.

So, as stated by the BDFL himself, the real reasons he chose explicit self over implicit self are that:

  • it is explicit
  • it is easier to implement, since the lookup must be done at runtime (and not at compile time, as in some other languages), and an implicit self could have increased the complexity (and thus the cost) of the lookups.
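
To make the quoted design concrete, here is a minimal sketch in Python 3 syntax. The class name `Spam` and its attribute values are made up for illustration. It shows that a method is an ordinary function whose first argument is the instance, so calling through the instance and calling through the class with an explicit argument give the same result.

```python
class Spam:
    def __init__(self, x):
        self.x = x  # instance variable, always reached through self

    def spam(self, y):
        # 'self' is an ordinary parameter; the name is only a convention
        return (self.x, y)


s = Spam(1)

# Calling through the instance passes s as the first argument automatically.
via_instance = s.spam(2)

# Calling through the class requires passing the instance explicitly.
via_class = Spam.spam(s, 2)

print(via_instance == via_class)
```

Both calls return the same tuple, which is the equivalence between x.m(args) and f(x, args) described in the quote.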

Edit: There is also an answer in the Python FAQ.

Bakuriu
  • thanks for this quote. i was looking for reasoning behind this useless "self". `it is easier to implement` - we call it "put a burden on client" ;) – Max Jun 05 '22 at 15:35
7

It seems to be related to module vs. class scope handling, in Python:

COLOR = 'blue'

class TellColor(object):
    COLOR = 'red'

    def tell(self):
        print self.COLOR   # references class variable
        print COLOR        # references module variable

a = TellColor()
a.tell()

> red
> blue
heltonbiker
6

Here is what I wrote in an older answer about this feature:


The problem you encountered is due to this:

A block is a piece of Python program text that is executed as a unit. The following are blocks: a module, a function body, and a class definition.

(...)

A scope defines the visibility of a name within a block.

(...)

The scope of names defined in a class block is limited to the class block; it does not extend to the code blocks of methods – this includes generator expressions since they are implemented using a function scope. This means that the following will fail:

class A:
    a = 42
    b = list(a + i for i in range(10))

http://docs.python.org/reference/executionmodel.html#naming-and-binding

In other words: a function body is a code block and a method is a function, so names defined in the class definition outside the function body are not visible inside the function body.
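
A small sketch of this rule, written for Python 3. The method names `read_bare` and `read_attr` are hypothetical and chosen only for this illustration. It assumes no module-level name `a` exists, so the bare lookup has nothing to fall back on.

```python
class A:
    a = 42

    def read_bare(self):
        # 'a' is not a local, and the class block is skipped during lookup,
        # so Python searches globals and builtins and finds nothing.
        return a

    def read_attr(self):
        # Attribute lookup on the instance falls back to the class.
        return self.a


obj = A()

try:
    obj.read_bare()
    bare_result = "found"
except NameError:
    bare_result = "NameError"

attr_result = obj.read_attr()
print(bare_result, attr_result)
```

The bare name raises NameError when the method is called, while the same value is reachable as an attribute through self.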


It seemed strange to me when I first read this, but that is how Python is designed:

The scope of names defined in a class block is limited to the class block; it does not extend to the code blocks of methods

This comes straight from the official documentation.


EDIT

heltonbiker posted an interesting piece of code:

COLOR = 'blue'

class TellColor(object):
    COLOR = 'red'

    def tell(self):
        print self.COLOR   # references class variable
        print COLOR        # references module variable

a = TellColor()
a.tell()

> red
> blue

It made me wonder why the instruction print COLOR inside the method tell() prints the value of the global object COLOR defined outside the class.
I found the answer in this part of the official documentation:

Methods may reference global names in the same way as ordinary functions. The global scope associated with a method is the module containing its definition. (A class is never used as a global scope.) While one rarely encounters a good reason for using global data in a method, there are many legitimate uses of the global scope: for one thing, functions and modules imported into the global scope can be used by methods, as well as functions and classes defined in it. Usually, the class containing the method is itself defined in this global scope (...)

http://docs.python.org/2/tutorial/classes.html#method-objects

When the interpreter executes print self.COLOR, COLOR is not an instance attribute (that is, the identifier 'COLOR' does not belong to the namespace of the instance), so the interpreter searches the namespace of the instance's class for the identifier 'COLOR', finds it, and prints the value of TellColor.COLOR.

When the interpreter executes print COLOR, the instruction contains no attribute access and COLOR is not a local name of tell(), so it searches for the identifier 'COLOR' in the global namespace, which, according to the official documentation, is the module's namespace.
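
The two lookups can be inspected directly. This is a sketch in Python 3 syntax that reuses the names from the example above and relies only on the built-in `vars()`. The value 'green' is an arbitrary choice for illustration.

```python
COLOR = 'blue'  # module-level (global) name


class TellColor:
    COLOR = 'red'  # class attribute


a = TellColor()

# The instance namespace is empty, so a.COLOR is found on the class.
in_instance = 'COLOR' in vars(a)
in_class = 'COLOR' in vars(TellColor)
before = a.COLOR

# A bare name skips the class entirely and resolves to the module global.
bare = COLOR

# Assigning through the instance creates an instance attribute that
# shadows the class attribute for this object only.
a.COLOR = 'green'
after = a.COLOR
class_value = TellColor.COLOR

print(in_instance, in_class, before, bare, after, class_value)
```

The attribute access walks from the instance to its class, while the bare name never consults either of them.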

Community
eyquem
  • I have seen a lot of domain-specific modules (geospatial, for example) with named constants declared globally inside the module (for example, `EARTH_RADIUS`. If someone really wants to use this it's just import the module, like `from Geostuff import EARTH_RADIUS` or `import Geostuff; a = Geostuff.EARTH_RADIUS`. – heltonbiker Dec 01 '12 at 00:07
3

What attribute names are attached to an object (and its class, and the ancestors of that class) is not decidable at compile time. So you either make attribute lookup explicit, or you:

  • eradicate local variables (in methods) and always use instance variables. This does no good, as it essentially removes local variables with all their advantages (at least in methods).
  • decide whether a bare x refers to an attribute or a local at runtime (with some extra rules to decide when x = ... adds a new attribute if there's no self.x). This makes code less readable, as you never know which one a name is supposed to be, and essentially turns every local variable in all methods into part of the public interface (as attaching an attribute of that name changes the behavior of a method).

Both have the added disadvantage that they require special casing for methods. Right now, a "method" is just a regular function that happens to be accessible through a class attribute. This is very useful for a wide variety of good use cases.
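
As a rough illustration of "just a regular function accessible through a class attribute", here is a sketch in Python 3 syntax. The function name `describe` is hypothetical. It is defined outside any class and attached afterwards.

```python
def describe(self):
    # A plain module-level function; nothing marks it as a method.
    return "x is %d" % self.x


class ExampleClass:
    x = 123


# Attaching the function to the class makes it callable as a method.
ExampleClass.describe = describe

obj = ExampleClass()
as_method = obj.describe()
as_function = describe(obj)
print(as_method, as_function)
```

This works without any special syntax precisely because the function already receives the instance as an explicit first argument.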

  • Might mention why this still does work out for global variables, which have basically the same situation (actually, there's a keyword to distinguish: ``global``) – Jonas Schäfer Nov 30 '12 at 19:51
  • @JonasWielicki Which variables belong to which scope is decidable at compile time (if it's assigned to, it's local; if not, it's in the innermost scope where it's assigned to, defaulting to global; `global` and `nonlocal` override that). –  Nov 30 '12 at 19:52
  • Ah yes you're right. I always forget the UnboundLocalError you get when reading from a variable which is available in the global scope, but to which you assign a value later on in the function. – Jonas Schäfer Nov 30 '12 at 19:59
  • @delnan I don't understand what means _decidable at compile time_. Decidable by what ? – eyquem Nov 30 '12 at 21:03
  • @eyquem Decidable by the compiler. Or by any program really. Or humans, for that matter. Consider a program that does `name = random_identifier(); setattr(something, name, value)`. –  Nov 30 '12 at 21:08
  • @delnan Has the compiler the ability to decide something of the algorithm ? "Any" program ?? Which one, how ?? By humans ??? at the time of compilation !?? – eyquem Nov 30 '12 at 21:49
  • @eyquem The Python compiler, when compiling Python code, detects which variables are local and which ones are global (and which ones are "nonlocal"). Neither the current compiler, nor any other program, can detect (in general) which objects have which attributes. –  Nov 30 '12 at 21:52
  • @delnan I don't understand why you write at one moment _Which variables belong to which scope is decidable at compile time_ and at another moment _Neither the current compiler, (...), can detect (in general) which objects have which attributes_ That seems to me contradictory. By the way, using _(in general)_ gives a waving sense to the sentence - But in fact, I don't understand what means _decidable at compile time_ . - However I think that this subject is too hard for me and I don't want to plunge in an 'extended discussion in comments'. I'll try to understand elsewhere – eyquem Nov 30 '12 at 22:06