
If I define a Python class which has an instance field/variable __MAX_N, it seems Python applies this private name mangling thing. But if I rename it to __MAX_N__ then there's no mangling, and I can access it freely from the outside?! That kind of surprised me so I wonder if __MAX_N__ is supposed to designate (by Python convention) something else say a constant or something else?

So I mean...

  1. Why is __A__ not mangled and __A is?

Also...

  2. Is there mangling for class (non-instance) fields?
  3. Is there mangling for method names?

I mean specifically in Python 3.x, not interested in 2.x.

EDIT:

I was pointed to this part of the Python docs.

Any identifier of the form __spam (at least two leading underscores, at most one trailing underscore) is textually replaced with _classname__spam, where classname is the current class name with leading underscore(s) stripped

So why are the __A__ instance fields not mangled? What is the idea behind them?

peter.petrov
  • Any prefix *and* suffix double-dunder method is probably assumed to be a magic method, not a private one. – Jared Smith Jun 26 '20 at 17:30
  • Double-dunder methods are reserved by the language; you aren't supposed to define your own, so there's no need to decide if they should be treated as public or private. They are *meant* to be overridden, but not called explicitly. – chepner Jun 26 '20 at 18:02
  • Questions 2 and 3 are easy to test. Have you tried them? – wjandrea Jun 27 '20 at 18:18

2 Answers


tl;dr: don't do that.

Avoid a leading double underscore, __x. Prefer a single underscore, _x, for private variables.

The rules are: https://docs.python.org/3/tutorial/classes.html#private-variables

Sometimes there is good reason to design for class inheritance using double underscore prefix. But usually, when you think it might be a good idea, it isn't, and will prove more trouble than it's worth.
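
As a rough sketch of the inheritance case where the double underscore prefix does help: the hypothetical classes Base and Derived below (names invented for illustration) both use an attribute spelled __token, and mangling keeps the two from overwriting each other.

```python
class Base:
    def __init__(self):
        self.__token = "base"       # stored on the instance as _Base__token

    def base_token(self):
        return self.__token         # compiled as self._Base__token


class Derived(Base):
    def __init__(self):
        super().__init__()
        self.__token = "derived"    # stored as _Derived__token, so no clash

    def derived_token(self):
        return self.__token         # compiled as self._Derived__token


obj = Derived()
print(obj.base_token())      # base
print(obj.derived_token())   # derived
print(sorted(vars(obj)))     # ['_Base__token', '_Derived__token']
```

With a single underscore (_token) the subclass assignment would silently replace the value the base class relies on, which is the accident mangling is meant to prevent.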


Also, a double underscore before and after the name is usually for Python operator hooks and other special methods, e.g. __add__, invoked by the + operator, or __str__, invoked by str().
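
A small sketch of how such special methods get used, with a made-up Money class (not from any library): the interpreter looks up __add__ and __str__ by their exact names, so these names have to stay unmangled.

```python
class Money:
    """Hypothetical value type, used only to illustrate special methods."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        # Called by the interpreter for `self + other`.
        return Money(self.cents + other.cents)

    def __str__(self):
        # Called by str() and print().
        return f"${self.cents / 100:.2f}"


total = Money(150) + Money(275)
print(total)   # $4.25
```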


Do feel free to invent "similar" but non-conflicting names through a single underscore suffix.

Sometimes you really want to call a directory dir, or a zipcode zip. But it's a bad idea to shadow builtins. So the convention is to create a "related" identifier with single _ underscore suffix, like dir_ or zip_. Others that crop up a fair amount: hash_, hex_, id_, map_, max_, min_, range_, sum_. Sometimes there is a natural synonym or abbreviation, so you can neatly sidestep the issue: ch, length, lst, nxt, typ.
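
For instance, a sketch with a hypothetical format_address helper: because the parameters are spelled zip_ and id_, the builtin zip() is still available inside the function body.

```python
def format_address(city, zip_, id_=None):
    # zip_ and id_ are ordinary names; the trailing underscore only avoids
    # shadowing the builtins zip() and id().
    fields = ["city", "zip"]
    record = dict(zip(fields, [city, zip_]))   # builtin zip() still works
    if id_ is not None:
        record["id"] = id_
    return record


print(format_address("Beverly Hills", "90210", id_=7))
# {'city': 'Beverly Hills', 'zip': '90210', 'id': 7}
```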

The case of a generic dict arises quite often, and naming it simply d typically works fine, much as a generic string will often be named s, or dictionary key → value is k: v.

J_H
  • @peter.petrov From the link above: _" Any identifier of the form `__spam` (at least two leading underscores, at most one trailing underscore) is textually replaced with `_classname__spam`"_ – Brian61354270 Jun 26 '20 at 17:15
  • Thanks... But if we draw an analogy with Java, isn't `_` supposed to designate protected and `__` private ? I wanted to designate my fields as private so I used the double underscores. But OK... my real question is kind of left unanswered - why is `__A__` not mangled and `__A` is? What are the rules (in not too formal language)? – peter.petrov Jun 26 '20 at 17:15
  • It's only a loose analogy, because Python doesn't have protected or private. Even the name-mangling is easily bypassed. From the tutorial: "Note that the mangling rules are designed mostly to avoid accidents; it still is possible to access or modify a variable that is considered private. This can even be useful in special circumstances, such as in the debugger." As for why `__A__` isn't mangled, it's because names with *both* a `__` prefix and suffix aren't. – chepner Jun 26 '20 at 17:15
  • Guys, thanks, OK, I understand it's a loose analogy and all that... I just wonder why `__A__` is left not mangled? Does it by convention designate something else in Python e.g. a public constant?! Just guessing... – peter.petrov Jun 26 '20 at 17:16
  • @peter.petrov The answer is "maybe." Identifiers of the form `__A__` are reserved (by convention only) for special uses. As for why it is left unmangled, does it begin with "at least two leading underscores, [and have] at most one trailing underscore"? – Brian61354270 Jun 26 '20 at 17:20
  • @Brian Hm... how are the `__A__` "reserved" if one can freely access them while one cannot access `__A` that freely (from outside the class) because for `__A` there's at least some mangling done there? Sorry... My main language is Java... I just find it hard to follow Python's logic sometimes :) but yeah... I am getting better at it. – peter.petrov Jun 26 '20 at 17:21
  • Java wants to make it provably impossible for code outside a class to access certain aspects of that class; it tries to offer guarantees. In contrast, Python asserts that "we're all adults here" and lets you pretty much touch anything, with the understanding that "hey, this area is hands off, you shouldn't be playing around in here!" and then you get what you deserve if you ignored the rules and things didn't go well. Python identifiers starting with a letter are public, and those starting with a single underscore are private by convention. – J_H Jun 26 '20 at 20:06
  • When defining your public API for others to use, initial letter is what you want. For private helpers and private attributes, start them with a single `_` underscore. Don't do crazy things beyond that. When consuming someone else's library code, try to stick to calling / using identifiers that start with letter, and avoid any that start with single underscore. Python is not Java. Embrace the change, speak like a pythonista! – J_H Jun 26 '20 at 20:12

From the documentation "Reserved classes of identifiers":

Certain classes of identifiers (besides keywords) have special meanings. These classes are identified by the patterns of leading and trailing underscore characters:

_*
Not imported by from module import *. [...]

__*__
System-defined names, informally known as “dunder” names. These names are defined by the interpreter and its implementation (including the standard library). Current system names are discussed in the Special method names section and elsewhere. More will likely be defined in future versions of Python. Any use of __*__ names, in any context, that does not follow explicitly documented use, is subject to breakage without warning.

__*
Class-private names. Names in this category, when used within the context of a class definition, are re-written to use a mangled form to help avoid name clashes between “private” attributes of base and derived classes. See section Identifiers (Names).

The order is significant here. Names that match the second pattern are not mangled.
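
A quick experiment along these lines, using a throwaway Config class (the class and attribute names are made up, and defining your own __MAX_N__ is done here only to observe the behavior, not as a recommendation), suggests that mangling applies equally to class attributes, instance attributes and method names, while the __*__ form is left alone:

```python
class Config:
    __MAX_N = 10          # class attribute: mangled to _Config__MAX_N
    __MAX_N__ = 20        # matches __*__: not mangled

    def __init__(self):
        self.__count = 0  # instance attribute: mangled to _Config__count

    def __bump(self):     # method name: mangled to _Config__bump
        self.__count += 1
        return self.__count

    def bump(self):
        return self.__bump()   # inside the class the short name works


c = Config()
print(Config.__MAX_N__)             # 20
print(Config._Config__MAX_N)        # 10
print(c.bump())                     # 1
print(c._Config__bump())            # 2
print(hasattr(Config, "__MAX_N"))   # False
print(hasattr(c, "__count"))        # False
```

The mangled names remain reachable from outside, which matches the documentation's point that mangling avoids accidental clashes rather than enforcing privacy.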

wjandrea