10

I am studying python. I am trying to understand how to design a library that exposes a public api. I want avoid to expose internal methods that could change in future. I am looking for a simple and pythonic way to do it. I have a library that contains a bunch of classes. Some methods of those classes are used internally among classes. I don't want to expose those methods to the client code.

Suppose that my library (f.e. mylib) contains a class C with two methods a C.public() method thought to be used from client code and C.internal() method used to do some work into the library code. I want to commit myself to the public api (C.public()) but I am expecting to change the C.internal() method in future, for example adding or removing parameters.

The following code illustrates my question:

mylib/c.py:

class C:
    def public(self):
        pass

    def internal(self):
        pass

mylib/f.py:

class F:
    def build():
        c = C()
        c.internal()
        return c

mylib/__init__.py:

from mylib.c import C
from mylib.f import F

client/client.py:

import mylib
f = mylib.F()
c = f.build()
c.public()
c.internal()  # I wish to hide this from client code

I have thought the following solutions:

  • document only public api, warning user in documentation to don't use private library api. Live in peace hoping that clients will use only public api. If the next library version breaks client code is the client fault:).

  • use some form of naming convention, f.e. prefix each method with "_", (it is reserved for protected methods and raises a warning into ide), perhaps I can use other prefixes.

  • use objects composition to hide internal methods. For example the library could return to the clients only PC object that embeds C objects.

mylib/pc.py:

class PC:
    def __init__(self, c):
        self.__c__

    def public(self):
        self.__cc__.public()

But this looks a little contrived.

Any suggestion is appreciated :-)

Update

It was suggested that this question is duplicated of Does Python have “private” variables in classes?

It is similar question but I is a bit different about scope. My scope is a library not a single class. I am wondering if there is some convention about marking (or forcing) which are the public methods/classes/functions of a library. For example I use the __init__.py to export the public classes or functions. I am wondering if there is some convention about exporting class methods or if i can rely only on documentation. I know I can use "_" prefix for marking protected methods. As best as I know protected method are method that can be used in class hierarchy.

I have found a question about marking public method with a decorator @api Sphinx Public API documentation but it was about 3 years ago. There is commonly accepted solution, so if someone are reading my code understand what are methods intended to be library public api, and methods intended to be used internally in the library? Hope I have clarified my questions. Thanks all!

Community
  • 1
  • 1
Giovanni
  • 625
  • 2
  • 6
  • 19

2 Answers2

2

You cannot really hide methods and attributes of objects. If you want to be sure that your internal methods are not exposed, wrapping is the way to go:

class PublicC:
    def __init__(self):
        self._c = C()

    def public(self):
        self._c.public()

Double underscore as a prefix is usually discouraged as far as I know to prevent collision with python internals.

What is discouraged are __myvar__ names with double-underscore prefix+suffix ...this naming style is used by many python internals and should be avoided -- Anentropic

If you prefer subclassing, you could overwrite internal methods and raise Errors:

class PublicC(C):
    def internal(self):
        raise Exception('This is a private method')

If you want to use some python magic, you can have a look at __getattribute__. Here you can check what your user is trying to retrieve (a function or an attribute) and raise AttributeError if the client wants to go for an internal/blacklisted method.

class C(object):
    def public(self):
        print "i am a public method"

    def internal(self):
        print "i should not be exposed"

class PublicC(C):
    blacklist = ['internal']

    def __getattribute__(self, name):
        if name in PublicC.blacklist:
            raise AttributeError("{} is internal".format(name))
        else: 
            return super(C, self).__getattribute__(name) 

c = PublicC()
c.public()
c.internal()

# --- output ---

i am a public method
Traceback (most recent call last):
  File "covering.py", line 19, in <module>
    c.internal()
  File "covering.py", line 13, in __getattribute__
    raise AttributeError("{} is internal".format(name))
AttributeError: internal is internal

I assume this causes the least code overhead but also requires some maintenance. You could also reverse the check and whitelist methods.

...
whitelist = ['public']
def __getattribute__(self, name):
    if name not in PublicC.whitelist:
...

This might be better for your case since the whitelist will probably not change as often as the blacklist.

Eventually, it is up to you. As you said yourself: It's all about documentation.

Another remark:

Maybe you also want to reconsider your class structure. You already have a factory class F for C. Let F have all the internal methods.

class F:
    def build(self):
        c = C()
        self._internal(c)
        return c

    def _internal(self, c):
        # work with c

In this case you do not have to wrap or subclass anything. If there are no hard design constraints to render this impossible, I would recommend this approach.

aleneum
  • 2,083
  • 12
  • 29
  • 2
    "Double underscore as a prefix is usually discouraged as far as I know to prevent collision with python internals." this is not true... double-underscore prefix _is_ the python way of doing private members... outside of your class the name is obfuscated and harder to access. What is discouraged are `__myvar__` names with double-underscore _prefix+suffix_ ...this naming style is used by many python internals and should be avoided – Anentropic Mar 15 '16 at 12:09
  • @aleneum Thank you very much for the answer. – Giovanni Mar 16 '16 at 11:36
  • @Giovanni, you are welcome. Be careful though. You might run into more trouble preventing clients from doing things than if you just let people make their own experiences. Especially __getattribute__ is powerful but also a common source of bugs/errors. Good luck with your project! – aleneum Mar 16 '16 at 14:15
2

I have thought the following solutions:

  • document only public api, warning user in documentation to don't use private library api. Live in peace hoping that clients will use only public api. If the next library version breaks client code is the client fault:).

  • use some form of naming convention, f.e. prefix each method with "_", (it is reserved for protected methods and raises a warning into ide), perhaps I can use other prefixes.

  • use objects composition to hide internal methods. For example the library could return to the clients only PC object that embeds C objects.

You got it pretty right with the first two points.

The Pythonic way is to name internal methods starting with single underscore '_', this way all Python developers know that this method is there, but it's use is discouraged and won't use it. (Until they decide to do some monkey-patching, but you shouldn't care for this scenario.) For newbie developers you might want to mention explicitly about not using methods starting with underscore. Also, just don't provide public documentation for your "private" methods, use it for internal reference only.

You might want to take a look at "name mangling", but it's less common.

Hiding internals with object composition or methods like __getattribute__ and etc. is generally discouraged in Python.

You might want to look at source code of some popular libraries to see how they manage this, e.g. Django, Twisted, etc.

Nikita
  • 6,101
  • 2
  • 26
  • 44
  • Thank you very much for the answer. For the moment, I will go with not documenting internal methods. I hope to have the chance to read the source code of some popular library. – Giovanni Mar 16 '16 at 11:40
  • 1
    @Giovanni, don't forget to also start internal method names with single underscore `_` and it'll be a complete Pythonic solution to your issue. – Nikita Mar 16 '16 at 11:45
  • @Nikita: "Hiding internals with ... methods like `__getattribute__` and etc. is generally discouraged in Python". Oh really? I always thought that's exactly the purpose of this. Any reference for your statement? Quote from Python in a Nutshell: "However, you may override `__getattribute__` for special purposes, such as hiding inherited class attributes (e.g., methods) for your subclass’s instances." – aleneum Mar 16 '16 at 14:14
  • @aleneum, don't take it personally, you obviously provided a very detailed answer. I wrote "shouldn't use", not "can't use" and your citation states "may", not "should". So, ofc it can be used for this purposes, but it shouldn't, until the use case is really special, and usually you should think twice to decide if it's special enough. You get the drawback in performance, special cases with operator overloading, slots and even recursive calls (if not done properly) - this all doesn't worth it. And even then the attributes will still be accessible via `.__dict__[attr]`, so why bother? – Nikita Mar 17 '16 at 07:31
  • @aleneum, There's a convention, I wrote about, for such cases and usually it's enough and "Pythonic". Attribute control in general is not a Pythonic way of programming and usually should be avoided. `__getattribute__` is good for some "meta" programming, encapsulation, ORMs, etc. And if you really need some citation to back this up, see my next comment with citation from Mark Lutz's "Learning Python", which he states after demonstration of implementing "private" and "public" attribute access controls using `__getattribute__` among other methods... (I spent time to find it especially for you.) – Nikita Mar 17 '16 at 07:35
  • @aleneum, "*Now that I’ve gone to such great lengths to implement Private and Public attribute declarations for Python code, I must again remind you that it is not entirely Pythonic to add access controls to your classes like this. In fact, most Python programmers will probably find this example to be largely or totally irrelevant, apart from serving as a demonstration of decorators in action. Most large Python programs get by successfully without any such controls at all. That said, you might find this tool useful in limited scopes during development. ...*" – Nikita Mar 17 '16 at 07:40
  • @Nikita: Thanks for your informative comments. Your effort is much appreciated. I just think its good practise to back up claims, especially with strong wording like 'generally discouraged'. I am sure these comments and quotes can help programmers to decide if masking is worth the hassle. – aleneum Mar 17 '16 at 11:07