Should you declare private instance variables of a class in the `__init__` method? My code works perfectly fine without doing this, but PyCharm highlights it as a warning and tells me to do so.
-
Does this answer your question? [Why are Python's 'private' methods not actually private?](https://stackoverflow.com/questions/70528/why-are-pythons-private-methods-not-actually-private) and [Does Python have “private” variables in classes?](https://stackoverflow.com/questions/1641219/does-python-have-private-variables-in-classes) – Peter Badida Apr 30 '22 at 13:13
-
Could you add the code you have, and the modification PyCharm suggests? – jthulhu Apr 30 '22 at 13:17
-
Conventionally, Python recommends prefixing a variable name with a single underscore to mark it as private. – Deepak Tripathi Apr 30 '22 at 13:25
-
@PeterBadida No, actually not. This is not about the privacy of variables, and we all know what _dunder_ means. The question is rather: why is *PyCharm* complaining? To answer that, we need more information. – Thomas Junk Apr 30 '22 at 13:26
-
It is good practice to set all instance variables (private or not) in `__init__`. Otherwise it is hard to tell if a variable exists when a particular method is called and can lead to errors because a variable wasn't set or to additional code to check for existence. – Michael Butscher Apr 30 '22 at 13:27
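A minimal sketch of the failure mode described in this comment. The classes `Report` and `SafeReport` and their methods are hypothetical, made up purely for illustration:

```python
class Report:
    """Hypothetical class whose attribute is only created by load()."""

    def load(self):
        self._rows = [1, 2, 3]  # attribute first appears here, not in __init__

    def total(self):
        return sum(self._rows)  # raises AttributeError if load() never ran


class SafeReport:
    """Same idea, but every attribute is assigned in __init__."""

    def __init__(self):
        self._rows = None  # sentinel meaning "no value here yet"

    def load(self):
        self._rows = [1, 2, 3]

    def total(self):
        if self._rows is None:  # the attribute always exists, so this check is safe
            self.load()
        return sum(self._rows)


def total_or_error(report):
    """Return the total, or the name of the exception that was raised."""
    try:
        return report.total()
    except AttributeError as exc:
        return type(exc).__name__
```

With `Report`, whether `total()` works depends on the order in which methods were called; with `SafeReport`, the attribute is always present and only its value changes.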
-
@DeepakTripathi: Nope. A single underscore means "protected, not part of public API", but two underscores (with no trailing underscore) means "private". The latter uses name mangling so two classes in a hierarchy can each have an instance attribute of the "same" name without colliding; methods defined in class `Parent` only see `Parent`'s version, methods in `Child` only see `Child`'s version. – ShadowRanger Apr 30 '22 at 13:29
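To illustrate the name mangling described in this comment, here is a small sketch. The attribute name `__token` and the accessor methods are assumptions chosen for the example:

```python
class Parent:
    def __init__(self):
        self.__token = "parent"  # stored on the instance as _Parent__token

    def parent_token(self):
        return self.__token  # resolves to _Parent__token


class Child(Parent):
    def __init__(self):
        super().__init__()
        self.__token = "child"  # stored as _Child__token, so no collision

    def child_token(self):
        return self.__token  # resolves to _Child__token


c = Child()
stored_names = sorted(vars(c))  # both mangled names live side by side
```

Each class sees only its own version of the "same" attribute, even though both live in the same instance.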
-
This topic is debatable, @ShadowRanger. The docs clearly state that there is no concept of private: "“Private” instance variables that cannot be accessed except from inside an object don’t exist in Python". If two underscores mean private and I can still access the attribute through its mangled name, then what's the point of private? – Deepak Tripathi Apr 30 '22 at 13:37
-
@DeepakTripathi: Privacy exists as much as it does in any language. Private doesn't actually mean anything *security-wise* in most languages, but people think it does because it's sufficiently difficult to violate the privacy (still easily done in most languages with raw pointers or reflection). Python dispenses with the charade; privacy exists solely to enable the use case of attributes that are only *naturally* visible to a single class and its methods, w/o the risk of overlapping those of a child class. The name-mangling makes it API private, without pretending to be a security protection. – ShadowRanger Apr 30 '22 at 13:43
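A short sketch of what "API private, but not a security protection" looks like in practice. The class `Account` and its attribute are hypothetical:

```python
class Account:
    def __init__(self):
        self.__balance = 100  # stored as _Account__balance


acct = Account()

# Outside the class body the name is not mangled, so the "natural"
# spelling does not find the attribute.
try:
    acct.__balance
    direct_access = "ok"
except AttributeError:
    direct_access = "AttributeError"

# Nothing prevents deliberate access through the mangled name.
mangled_access = acct._Account__balance
```

The double underscore keeps the attribute out of the way of subclasses and casual callers; it does not stop anyone who spells out the mangled name.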
1 Answer
It's generally considered good practice to assign to all instance variables in `__init__`, even if some of them are lazily given real values and all you can do in `__init__` is give them a sentinel that means "No value here yet" (e.g. `None`). There are two reasons for this:
- Maintainer benefit: If you don't follow this guideline, determining the complete set of attributes the class may have involves reading the entire class to look for lazily added attributes. It's a lot easier if maintainers can count on `__init__` to provide the complete set of attributes, even if some of them are given real values elsewhere.
- (On modern CPython, as an implementation detail) Reduced memory usage: When all instances of a class are given the same set of attributes, in the same order, and the set of attributes is not modified unpredictably after `__init__` (it's okay to reassign an attribute, just not to add or delete attributes), CPython uses a key-sharing dictionary to hold the attributes for each instance. The hash table itself that stores the keys ends up shared, tied to the class itself, and only the cheap array containing the values for the instance's attributes ends up costing memory. For the case of a class with a single attribute, this reduces the per-instance `__dict__` size from 232 bytes to 104, and the ratio remains similar as the number of attributes grows (the key-sharing `__dict__` costs less than half as much memory as a non-key-sharing `__dict__`).
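As a rough sketch of the guideline, the hypothetical class below (`LazyConfig`, with made-up attribute names) assigns every attribute in `__init__` and only reassigns them later. It shows that the set of attributes stays fixed; it does not measure memory, since whether CPython actually shares the keys is an implementation detail that varies by version.

```python
class LazyConfig:
    """Hypothetical class: all attributes exist from __init__ onward."""

    def __init__(self, path):
        self.path = path
        self._parsed = None  # sentinel, filled in by parse()
        self._errors = None  # sentinel, filled in by parse()

    def parse(self):
        # Reassigns existing attributes; adds and deletes nothing.
        self._parsed = {"path": self.path}
        self._errors = []


cfg = LazyConfig("settings.ini")
attrs_before = sorted(vars(cfg))
cfg.parse()
attrs_after = sorted(vars(cfg))
```

A maintainer can read `__init__` alone to learn the full set of attributes, and the set is the same before and after `parse()`.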

ShadowRanger
-
Does this not apply to private methods? PyCharm only seems to suggest doing so for private instance variables for public methods. – fAXvw5Le Apr 30 '22 at 13:35
-
@fAXvw5Le: It applies to both types of *attribute* (not method; methods are not defined in `__init__`), but at least from an *external* user's perspective (as opposed to someone maintaining the class itself), there's no observable difference whether or not an attribute exists, so if the class really *wants* to make itself harder to maintain and less efficient, it can do so without presenting an unpredictable set of attributes to consumers of the class; the people maintaining the class suffer, but the users of the class won't see the difference either way. – ShadowRanger Apr 30 '22 at 13:38
-
I meant private attributes of a non-public method, not the method itself. – fAXvw5Le Apr 30 '22 at 13:41
-
@fAXvw5Le: That makes... no sense. While methods *can* have attributes (they have a `__dict__` and you can assign arbitrary things to it), the idea of giving them *private* attributes is nonsensical (you don't define the function type that all user-defined methods are an instance of, and private attributes are solely for stuff defined by that class). You're going to need to be a *lot* more clear in your question (including code and PyCharm's warning as part of a [MCVE]) to have a chance of us figuring out what terminology mistakes you've locked in. – ShadowRanger Apr 30 '22 at 13:46
-
https://pastebin.com/iMNLb24K Here's my code after fixing it so PyCharm doesn't give a warning. Everything in `__init__` except `self.__data` was added due to PyCharm wanting to add it there. – fAXvw5Le Apr 30 '22 at 13:51
-
@fAXvw5Le: Okay, so private attributes were in fact involved, and PyCharm complained they weren't initialized. That tracks with my reasons; PyCharm is trying to make your code maintainer- and memory-friendly by consistently initializing all attributes to *something* in `__init__`. Ideally, you'd use this initialization to make more of your methods cache results (e.g. `__get_compressed_streams` should check if `self.__compressed_streams` is `None` or not, and only do all the work it's doing when it's `None`, so future calls don't redo the work needlessly). – ShadowRanger Apr 30 '22 at 14:52
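A sketch of the caching pattern suggested here. The asker's actual code is not shown in this thread, so the class `Archive`, the `b"|"` separator, and the `parse_count` counter are all assumptions made for illustration; only the names `__data`, `__compressed_streams`, and `__get_compressed_streams` come from the comments.

```python
class Archive:
    """Hypothetical stand-in for the asker's class."""

    def __init__(self, data):
        self.__data = data
        self.__compressed_streams = None  # sentinel: not computed yet
        self.parse_count = 0  # illustration only: counts how often the work runs

    def __get_compressed_streams(self):
        if self.__compressed_streams is None:
            self.parse_count += 1
            # Placeholder for the real work: split the data on a separator.
            self.__compressed_streams = self.__data.split(b"|")
        return self.__compressed_streams

    def stream_count(self):
        return len(self.__get_compressed_streams())
```

Repeated calls to `stream_count()` reuse the stored result, so the expensive step runs once per instance.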
-
Alternatively, if you don't want that caching (or it would be incorrect; a subsequent call can produce different results), then stop making stuff like `__compressed_streams` an instance attribute at all; `__get_compressed_streams` already returns the data, so instead of storing it as instance attributes, just have the caller cache it to a local (which disappears when the function returns). There's really only one reason to use lazily initialized attributes like this (to enable a cache); if you're not caching, just pass args and return results, don't rely on side-effects on your state. – ShadowRanger Apr 30 '22 at 14:55
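The non-caching alternative might look like the sketch below. As before, `Archive`, the `b"|"` separator, and `longest_stream` are hypothetical; the point is only that the result is returned and held in a local rather than stored on `self`.

```python
class Archive:
    """Hypothetical stand-in: no lazily initialized attributes at all."""

    def __init__(self, data):
        self.__data = data  # the only real instance state

    def __get_compressed_streams(self):
        # Pure computation: returns the result without touching self.
        return self.__data.split(b"|")

    def longest_stream(self):
        streams = self.__get_compressed_streams()  # local, gone after return
        return max(streams, key=len)
```

Because nothing is stored, `__init__` needs no sentinel for the streams, and there is nothing for PyCharm to flag.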
-
On the topic of "don't make stuff instance attributes", you have at least one "attribute" that should be a local variable; `self.__offset` is *always* initialized to a new value in each method that sets it, used, and then never used elsewhere. It has no business being part of the instance state when it's recomputed every time. So get rid of it, deleting the initialization in `__init__`, and replacing *all* other uses of `self.__offset` with just `offset` (no `self`, no `__`). Do the same for any other instance attributes that aren't really part of the state. Code will be simpler and faster. – ShadowRanger Apr 30 '22 at 15:00
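A sketch of that refactoring, assuming a hypothetical `Reader` class that parses a 4-byte tag followed by a little-endian 16-bit length (the asker's real layout is not shown in this thread):

```python
import struct


class Reader:
    """Hypothetical parser: offset is scratch data, so it stays local."""

    def __init__(self, data):
        self.__data = data  # no self.__offset here

    def read_header(self):
        offset = 0  # plain local instead of self.__offset
        magic = self.__data[offset:offset + 4]
        offset += 4
        # "<H" is a little-endian unsigned 16-bit integer.
        (length,) = struct.unpack_from("<H", self.__data, offset)
        return magic, length
```

The local `offset` disappears when `read_header()` returns, so the instance carries only the state that actually persists between calls.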