This question builds on this one asked earlier and also this one, but addresses a more subtle point, specifically, what counts as an "internal class"?
Here's my situation. I'm building a Python library for manipulating MS Office files in which many of the classes are not meant to be constructed externally, yet many of their methods are important parts of the API. For example:
    class _Slide(object):
        def add_shape(self):
            ...
_Slide is not to be constructed externally; as an end user of the library, you get a new slide by calling Presentation.add_slide(). However, once you have a slide, you definitely want to be able to call add_shape(). So the method is part of the API but the constructor isn't. This situation arises dozens of times in the library because only the Presentation class has an external/API constructor.
PEP 8 is ambiguous on this point: it refers to "internal classes" without elaborating on what counts as one. In this case I suppose one could say the class is "partially internal".
The problem is I only have two notations to distinguish at least three different situations:
- The constructor and "public/API" methods of the class are all available for use.
- The constructor is not to be used, but the "public/API" methods of the class are.
- The class really is internal, has no public API, and I reserve the right to change or remove it in a future release.
I note that, from a communications perspective, there is a conflict between clear expression of the external API (library end-user audience) and the internal API (developer audience). To an end user, calling _Slide() is verboten/at-own-risk. However, it's a happy member of the internal API and distinct from _random_helper_method(), which should only be called from within _Slide().
Do you have a point of view on this question that might help me? Does convention dictate that I spend my "single-leading-underscore" ammunition on class names to fight for clarity in the end-user API? Or can I feel good about reserving it to tell myself and other developers when a class is really private: an implementation detail that might change and should not be used, for example, by objects outside the module it lives in?
UPDATE: After a few years of further reflection, I've settled into the convention of using a leading underscore when naming classes that are not intended to be accessed as a class outside their module (file). Such access is typically to instantiate an object of that class or to access a class method, etc.
This gives users of the module (often yourself of course :) a basic indicator: "If a class name with a leading underscore appears in an import statement, you're doing something wrong." (Unit test modules are an exception to this rule; such imports may often appear in the unit tests for an "internal" class.)
Note that this is access to the class. Access to an object of that class (type) outside the module, perhaps provided by a factory or whatever, is perfectly fine and perhaps expected. I think this failure to distinguish classes from the objects created from them was what led to my initial confusion.
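The import-statement rule of thumb can be checked mechanically. The following is only a rough sketch using the standard-library ast module; the function name underscored_imports and the pptx_demo module names are made up for the example:

```python
import ast


def underscored_imports(source):
    """Return (module, name) pairs for `from X import _Name` statements."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name.startswith("_"):
                    found.append((node.module, alias.name))
    return found


# Hypothetical client code: the second import names the class itself.
client_code = '''
from pptx_demo import Presentation
from pptx_demo.slide import _Slide
'''

violations = underscored_imports(client_code)
```

Working with a _Slide object returned by a factory never shows up here, since only the class name in an import statement is flagged. A real check would also need to exempt unit test modules.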
This convention also has the side benefit that these classes are not picked up by a from module import * statement. Although I never use these in my own code and recommend avoiding them, it's an appropriate behavior because those class identifiers are not intended to be part of the module interface.
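A quick way to see the star-import behavior is to build a throwaway module in memory. This is a sketch that assumes the module defines no __all__ (if it does, __all__ decides what gets exported); shapes_demo is a hypothetical module name:

```python
import sys
import types

# Source for a throwaway in-memory module; the names are illustrative only.
source = '''
class Presentation(object):
    def add_slide(self):
        return _Slide()

class _Slide(object):
    def add_shape(self):
        return "shape"
'''
module = types.ModuleType("shapes_demo")
exec(source, module.__dict__)
sys.modules["shapes_demo"] = module  # make it importable by name

# Simulate a client module doing a star import into its own namespace.
client_namespace = {}
exec("from shapes_demo import *", client_namespace)

# Ignore dunder entries such as __builtins__ that exec() adds.
star_imported = sorted(
    name for name in client_namespace if not name.startswith("__")
)

# The class name was not imported, but objects of that class are still
# reachable through the public factory method.
slide = client_namespace["Presentation"]().add_slide()
```

Here star_imported contains only Presentation, while slide is still a _Slide instance obtained through the factory.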
This is my personal "best-practice" after years of trial, and not of course the "right" way. Your mileage may vary.