15

A pure function is similar to a mathematical function: it does not interact with the "real world" and has no side effects. From a more practical point of view, this means that a pure function cannot:

  • Print or otherwise show a message
  • Be random
  • Depend on system time
  • Change global variables
  • And others

All these limitations make it easier to reason about pure functions than impure ones. Most functions should therefore be pure, so that the program has fewer bugs.

In languages with a rich type system, like Haskell, the reader can know right from the start whether a function is pure, which makes the subsequent reading easier.

In Python this information may be emulated by a @pure decorator put on top of the function. I would also like that decorator to actually do some validation work. My problem lies in the implementation of such a decorator.

Right now I simply search the source code of the function for buzzwords such as global, random or print, and complain if I find one of them.

import inspect

def pure(function):
    """Reject functions whose source text mentions a known impure name."""
    source = inspect.getsource(function)
    # Plain substring search: it matches identifiers, comments and strings alike.
    for non_pure_indicator in ('random', 'time', 'input', 'print', 'global'):
        if non_pure_indicator in source:
            raise ValueError("The function {} is not pure as it uses `{}`".format(
                function.__name__, non_pure_indicator))
    return function

However, it feels like a weird hack that may or may not work depending on your luck. Could you please help me write a better decorator?
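
To make the "depending on your luck" part concrete, here is a minimal sketch that compares the substring search with a walk over the syntax tree (the `ast.parse` route mentioned in the comments). The helper names `substring_hits` and `ast_hits` and the sample function `scale` are made up for this illustration. The tree-based version avoids false positives from docstrings and longer identifiers, but it is still only an incomplete blacklist, not a proof of purity.

```python
import ast

NON_PURE_NAMES = ('random', 'time', 'input', 'print')

def substring_hits(source):
    """Indicators found by plain substring search over the source text."""
    return [word for word in NON_PURE_NAMES + ('global',) if word in source]

def ast_hits(source):
    """Indicators found by walking the syntax tree instead of the raw text."""
    hits = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Global):
            hits.add('global')
        elif isinstance(node, ast.Name) and node.id in NON_PURE_NAMES:
            hits.add(node.id)
    return sorted(hits)

# Hypothetical function that is pure, yet trips the substring search twice:
# 'time' hides inside 'runtime' and 'print' inside the docstring.
SOURCE = '''
def scale(runtime, factor):
    """Multiply a runtime by a factor; nothing is printed."""
    return runtime * factor
'''

print(substring_hits(SOURCE))  # ['time', 'print']
print(ast_hits(SOURCE))        # []
```

Both helpers take the source as a string, so they could be fed the result of `inspect.getsource` inside a decorator.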

Caridorc
  • 3
    You could `inspect.getsource` then `ast.parse` and walk the nodes check various things... but you'd be going against the reason the language exists - look at using the `abc` module if you want stuff, then `isinstance` checking where needs be... - python is **strongly** typed - not **statically** typed – Jon Clements Jul 22 '15 at 16:12
  • @JonClements dynamic languages do in fact perform less compile-time verification, but I think that this particular check would greatly enhance program organization and double-check the programmer's understanding of his own work. – Caridorc Jul 22 '15 at 16:14
  • 4
    Then use a statically typed language... :) You can either view it as a *bad* thing or a *good* thing... but it's the way it is – Jon Clements Jul 22 '15 at 16:15
  • You could perhaps rule out some obvious problems, but every nontrivial function calls dozens of methods, `__dunder__` methods, and other functions. Each and every one of those calls can do anything at all, from modifying virtually any object up to and including changing what functions will be called on the next line. An incomplete black list is the best you'll be able to do, but that can also be done statically by a linter, no need for run time validation. –  Jul 22 '15 at 16:18
  • @JonClements very few languages segregate pure and non-pure functions, I think Haskell would be a good choice, but I find it very difficult to learn as I find the type error messages very cryptic. – Caridorc Jul 22 '15 at 16:21
  • 1
    you can be entirely evil and just....[misplace a function's `__globals__`](http://bytes.com/topic/python/answers/43111-turn-globals-function), but not recommended. Honestly, just learn Haskell. – NightShadeQueen Jul 22 '15 at 16:51
  • You're going to want to do bytecode inspection, probably, rather than source code inspection. – kindall Jul 22 '15 at 17:55
  • 2
    This is impossible. Give up. – Veedrac Jul 22 '15 at 23:17
  • @Veedrac Post it as an answer; if nothing better comes along I will accept it. – Caridorc Jul 23 '15 at 09:00

2 Answers

15

I kind of see where you are coming from but I don't think this can work. Let's take a simple example:

def add(a,b):
    return a + b

So this probably looks "pure" to you. But in Python the + here is an arbitrary function which can do anything, depending on the bindings in force when it is called: it dispatches to the __add__ (or __radd__) method of the operands. So a + b can have arbitrary side effects.
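
As a minimal sketch of that point, the same `add` stays silent for plain integers and records a side effect for an operand whose `__add__` is overridden. The class `Noisy` and the list `side_effects` are hypothetical names used only for this illustration.

```python
side_effects = []  # hypothetical log that lives outside the function

class Noisy:
    """A number-like wrapper whose + writes to the log above."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        side_effects.append(('add', self.value, other))
        return self.value + other

def add(a, b):
    return a + b

print(add(1, 2))         # 3
print(side_effects)      # []
print(add(Noisy(1), 2))  # 3
print(side_effects)      # [('add', 1, 2)]
```

Nothing in the source of `add` changes between the two calls; only the type of the argument does.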

But it's even worse than that. Even if this is just doing standard integer + then there's more 'impure' stuff going on.

The + is creating a new object. Now if you are sure that only the caller has a reference to that new object then there is a sense in which you can think of this as a pure function. But you can't be sure that, during the creation process of that object, no reference to it leaked out.

For example:

class RegisteredNumber(int):

    numbers = []

    def __new__(cls,*args,**kwargs):
        self = int.__new__(cls,*args,**kwargs)
        self.numbers.append(self)
        return self

    def __add__(self,other):
        return RegisteredNumber(super().__add__(other))

c = RegisteredNumber(1) + 2

print(RegisteredNumber.numbers)

This will show that the supposedly pure add function has actually changed the state of the RegisteredNumber class. This is not a stupidly contrived example: in my production code base we have classes which track each created instance, for example, to allow access via key.

The notion of purity just doesn't make much sense in Python.

strubbly
  • 1
    I might word your final statement a little differently - the *notion* of purity can be considered just fine in Python, it's just that basically every nontrivial function is impure because there's no way to account for all the different inputs it could get and environments it could run under. – Ken Williams Jun 11 '21 at 16:32
1

(not an answer, but too long for a comment)

So if a function can return different values for the same set of arguments, it is not pure?

Remember that functions in Python are objects, so you want to check the purity of an object...

Take this example:

def foo(x):
    ret, foo.x = x*x+foo.x, foo.x+1
    return ret
foo.x=0

calling foo(3) repeatedly gives:

>>> foo(3)
9

>>> foo(3)
10

>>> foo(3)
11

...

Moreover, reading globals does not require the global statement or the globals() builtin inside your function. Global variables might change somewhere else, affecting the purity of your function.

All of the above situations might be difficult to detect at runtime.
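
A small sketch of the global-variable case, with the hypothetical names `RATE` and `scale`: the function body contains no `global` statement, yet its result changes when the module-level name is rebound elsewhere.

```python
RATE = 2  # module-level variable (hypothetical name)

def scale(x):
    # Reading RATE needs no `global` statement; only assigning to it would.
    return x * RATE

first = scale(3)
RATE = 5           # some other part of the program rebinds the global
second = scale(3)

print(first, second)  # 6 15
```

A source scan for the word global would find nothing to complain about here.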

fferri
  • Interesting idea, but I can think of many functions which are not pure which might seem so over short timescales like getting the hour of the day, o/s version number, current git branch, etc. – wallyk Jul 22 '15 at 17:19