I have a question, and whether I choose Python for a bigger project depends on the answer, which I cannot work out myself:

We all know that Python has no real object encapsulation, so there is nothing like "private" properties of an object. On this issue, Guido van Rossum says that one can access hidden parts of a foreign object without being "allowed" to, and answers with "we are all adults" and "just don't do it". I can live perfectly well with that as long as the software I write is in my own hands: I am responsible for my own errors and can simply avoid doing such things.

BUT - and here comes my question: What if I provide a plugin framework with extension points, and many of the plugins are written by OTHER people, perhaps ones that I cannot trust completely?

How do I prevent a plugin from accessing the internals of my framework?

Is there a way to achieve this, or is the only way to use Python to trust that no one will abuse my API?

Erik Kaplun
nerdoc
  • Some google search terms to help you out: [Sandbox python](http://www.google.com/search?q=sandboxed+python). – Mark Hildreth Oct 02 '13 at 19:35
  • I don't really see how this is a security issue. Who is being guarded from what? – Fred Foo Oct 02 '13 at 19:40
  • I really have to admit that I slightly dislike the tone of *"We all know that Python has no real object encapsulation"* :) – Erik Kaplun Oct 02 '13 at 19:45
  • @ErikAllik: at least we all know. C++ and Java programmers are still fooling themselves. – Fred Foo Oct 02 '13 at 19:53
  • What would be an example of "Abusing your API"? – SingleNegationElimination Oct 02 '13 at 22:35
  • How about `api._permissions['erik'].add('delete', '*')`? – Erik Kaplun Oct 02 '13 at 22:40
  • @dequestarmappartialsetattr: exactly what I meant, yes. Changing "internal" members of an object that are meant to be "private" in security terms, like _ or __ fields. – nerdoc Oct 03 '13 at 20:24
  • @ErikAllik: Python HAS no object encapsulation in the sense of keeping private members really private... I love Python (as far as I know it by now) - I must admit that I am no professional programmer (else I wouldn't have to ask questions like this one ;-) ) – nerdoc Oct 03 '13 at 20:27
  • @DrOetker: so do C++, Java and C# keep private members "really private"? Because I've demonstrated that they don't; so what is your definition of "really private"? – Erik Kaplun Oct 03 '13 at 21:12

1 Answer

You should never rely on `private`, `public`, etc. for security (as in "protection against malicious code and external threats"). They are meant to keep the programmer from shooting himself in the foot, not to serve as a (computer) security measure. You can also easily access private member fields of C++ objects, as long as you bypass static compiler checks and go straight to the memory, but would you say that C++ lacks true encapsulation?

So you would never really use `private` or `protected` as a security measure against malicious plugins in C++ or Java, and I assume the same holds for C#.
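The same holds for Python's own "private" spelling. As a minimal sketch (the `Account` class and its attribute are hypothetical, made up for illustration), a double-underscore attribute is only renamed by the compiler, not protected:

```python
class Account(object):
    def __init__(self, balance):
        # Two leading underscores trigger name mangling: the attribute is
        # stored as `_Account__balance`. This avoids accidental clashes in
        # subclasses; it is not access control.
        self.__balance = balance

    @property
    def balance(self):
        return self.__balance


account = Account(100)

# Outside the class body the name is not mangled, so the original
# spelling does not exist on the instance...
try:
    account.__balance
    reachable = True
except AttributeError:
    reachable = False

# ...but anyone who knows the mangling rule can read and write it.
account._Account__balance = 0
```

After this runs, `account.balance` reports the tampered value, which is exactly the kind of "abuse" a plugin could perform inside the host process.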

Your best bet is to run your plugins in a separate process and expose the core API over IPC/RPC or even a web service, or to run them in a sandbox (as @MarkHildreth pointed out). Alternatively, you can set up a certification and signing process for your plugins so that you can review and filter out potentially malicious plugins before they even get distributed.

NOTE:

You can actually achieve true encapsulation using lexical closures:

def Foo(param):
    # Wrap the value in a list so the setter can mutate it in place;
    # rebinding would need `nonlocal`, which was introduced only in 3.x.
    param = [param]
    class _Foo(object):
        @property
        def param(self):
            return param[0]
        @param.setter
        def param(self, val):
            param[0] = val
    return _Foo()

foo = Foo('bar')
print(foo.param)  # bar
foo.param = 'baz'
print(foo.param)  # baz
# the value lives in the closure, so there is no `foo._param` or similar to access

...but even then, the value is actually still relatively easily accessible via reflection:

>>> foo.__class__.param.fget.__closure__[0].cell_contents[0] = 'hey'
>>> foo.param
'hey'

...and even if this weren't possible, we'd still be left with ctypes which allows direct memory access, bypassing any remaining cosmetic "restrictions":

import ctypes
arr = (ctypes.c_ubyte * 64).from_address(id(foo))

Now you can read from `arr` or assign to it (this relies on CPython, where `id()` returns the object's memory address). You would have to work hard to traverse pointers from there down to the actual memory location where `.param` is stored, but it proves the point.
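As a smaller, self-contained illustration of the same idea, the sketch below assumes CPython (where `id()` is the object's address) and uses a hypothetical `secret` list. Given nothing but that integer, `ctypes` can hand back a live reference to the object:

```python
import ctypes

secret = ["internal", "state"]

# In CPython, id() is the object's memory address, so a plain integer
# is enough to locate the object (as long as the object is still alive).
address = id(secret)

# Reinterpret the address as a PyObject* and dereference it.
recovered = ctypes.cast(address, ctypes.py_object).value

# `recovered` is the very same list, so mutating it mutates `secret`.
recovered.append("tampered")
```

No attribute name, closure cell or reference was needed, only an address, which is why in-process "hiding" cannot stop code that is allowed to import `ctypes`.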

Erik Kaplun
  • Well put. Not only is this answer itself very helpful, but those links are very interesting and useful as well. You made a very good distinction with encapsulation being more of a programming tool rather than a security measure. – Gray Oct 02 '13 at 19:48
  • +1. I too tend to lament Python's lack of real access control, but only because it makes it too easy for me to write crappy OOP in my own classes. Even with third party classes, learning the public API is as much effort as I usually feel inclined to put in. I can't say I've ever felt sufficiently motivated to poke into the internals to the extent where having access control over someone else's classes would have made any appreciable difference. – Crowman Oct 02 '13 at 21:44
  • @ErikAllik Thank you very much - this was the answer I was searching for. Didn't know that it's possible in Java to access private members. In C++ I thought so, but I still put it into the Voodoo drawer... – nerdoc Oct 03 '13 at 20:29
  • Hm. I think this is, despite being a good approach in terms of security, not viable for programming a snappy application - a framework which heavily depends on calling the API's functions slows down and does not scale very well when all API calls go via RPC, am I right? I mean e.g. exposing functions for displaying dozens/hundreds of GUI elements on the screen via RPC calls... – nerdoc Oct 03 '13 at 20:42
  • @DrOetker: the sandbox approach is not very good then either, I guess; so your best option is probably to just trust the code you've verified and certified/approved to be safe and not introduce any overhead in the form of a security layer. – Erik Kaplun Oct 03 '13 at 21:14
  • @ErikAllik Even if this was 2 years ago, I wanted to say thanks. Certification is the way to go here then. – nerdoc Nov 30 '15 at 05:05
  • @NerDoc: I appreciate you coming back here with this and sharing with us! Is there a more elaborate story behind this? – Erik Kaplun Dec 01 '15 at 01:44
  • Not yet. But to come. What I'm planning is an Open Source Electronic Medical Record for (firstly) small practices. And I'll do this in Python - and it should have a plugin API to be extended by other companies, e.g. for ECG devices etc. Currently I am in the planning stage, and due to lack of time (I'm a doc myself) this is progressing with some delay. ;-) – nerdoc Dec 10 '15 at 09:31