1

I would like some tips on the Pythonic way to validate arguments when creating an instance of a class. I have a hard time understanding the proper use of the `__new__` method; maybe this is one of its uses? Say, for example, that I have a class `Test` that takes two arguments, `a` and `b`. If I want to ensure that both are integers and that `b` is greater than `a`, I could do the following:

class Test:
    def __init__(self, a, b):
        if not (isinstance(a,int) and isinstance(b,int)):
            raise Exception("bla bla error 1")
        if not b > a:
            raise Exception("bla bla error 2")

        self.a = a
        self.b = b
        #.....

Or I could do the following:

def validate_test_input(a,b):
    if not (isinstance(a, int) and isinstance(b, int)):
        raise Exception("bla bla error 1")
    if not b > a:
        raise Exception("bla bla error 2")

class Test:
    def __init__(self, a, b):
        validate_test_input(a,b)
        self.a = a
        self.b = b
        #.....

What would you do? Is there any convention for data validation? Should the `__new__` method be used here? If so, please show an example.

JustANoob
  • ``__new__`` exists for object construction, it's not needed for parameter validation. ``__init__`` is fine to validate its parameters. – MisterMiyagi Feb 04 '20 at 15:54
  • Everything you have posted is Pythonic. I'd try to avoid overriding `__new__`; its magic should be used intentionally, only when there is no other option. If you are interested, here is a link, https://www.code-learner.com/how-to-use-python-__new__-method-example/, where `__new__` is used to stop object instantiation if conditions are not met. But I have never done things this way. – mrEvgenX Feb 04 '20 at 15:54
  • You already have a technically perfect answer from DeepSpace, so there's not much to add to it, except that, most often, type-checking is rather unpythonic. I'm not saying it should never happen (there _are_ cases where it makes sense), but it should really be restricted to those (few) cases. – bruno desthuilliers Feb 04 '20 at 16:04

3 Answers

5

The first snippet is almost perfectly fine. Unless this logic is reused in several places in your code base, I would avoid the second snippet because it decouples the logic from the class.

I would just make some small semantic changes:

  • Raise proper exception types, i.e. TypeError and ValueError

  • Rephrase the conditions to be more readable (you may disagree as this is quite subjective)

  • Of course provide a useful text instead of "bla bla", but I trust you with that one ;)

    class Test:
        def __init__(self, a, b):
            if not isinstance(a, int) or not isinstance(b, int):
                raise TypeError("bla bla error 1")
            if b <= a:
                raise ValueError("bla bla error 2")

            self.a = a
            self.b = b
            #.....

Some may find the original if not (isinstance(a, int) and isinstance(b, int)) to be more readable than what I suggested, and I will not disagree. The same goes for if b <= a: versus the original if not b > a:. It depends on whether you prefer to stress the condition you want to be true or the condition you want to be false.

In this case, since we are raising an exception I prefer to stress the condition we want to be false.
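To see how the two exception types look from the caller's side, here is a minimal sketch. It assumes the requirement stated in the question (both arguments are integers and b is strictly greater than a); the error messages and the `describe` helper are made up for illustration.

```python
class Test:
    def __init__(self, a, b):
        # Wrong kind of object: report it as a TypeError.
        if not isinstance(a, int) or not isinstance(b, int):
            raise TypeError(
                f"a and b must be int, got {type(a).__name__} and {type(b).__name__}"
            )
        # Right type but unacceptable value: report it as a ValueError.
        if b <= a:
            raise ValueError(f"b must be greater than a, got a={a}, b={b}")

        self.a = a
        self.b = b


def describe(a, b):
    """Hypothetical helper: name the error, if any, that Test(a, b) raises."""
    try:
        Test(a, b)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__
    return "ok"
```

Distinct exception types let a caller catch a bad value separately from a bad type instead of catching a bare Exception.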

DeepSpace
  • Since the first check is for type, it should raise a ``TypeError``. – MisterMiyagi Feb 04 '20 at 15:55
  • I indeed have to disagree on the "Rephrase the conditions to be more readable" part ;-) – bruno desthuilliers Feb 04 '20 at 15:58
  • The reason I'm asking whether it's Pythonic is that I have heard that `__init__` should be as concrete as possible, i.e. just show what attributes it has and so on. I will have more arguments and more validations; would you still do that in `__init__`? – JustANoob Feb 04 '20 at 16:02
  • @JustANoob I think so, yes. If you have a lot of arguments that should all be the same type, you can do all the checks in a single line, i.e. `if not all(isinstance(arg, int) for arg in args)` – DeepSpace Feb 04 '20 at 16:04
2

If this code is still in development, I might do the following, which is not very different from your code:

class Test:
    def __init__(self, a, b):
        assert isinstance(a,int) and isinstance(b,int), "bla bla error 1"
        assert b > a, "bla bla error 2"

        self.a = a
        self.b = b
        #.....

And if I need these checks in released code (for example, in a library), I would convert the asserts to raise statements, raising a TypeError and a ValueError:

class Test:
    def __init__(self, a, b):
        if not (isinstance(a,int) and isinstance(b,int)):
            raise TypeError("bla bla error 1")
        if not b > a:
            raise ValueError("bla bla error 2")

        self.a = a
        self.b = b
        #.....

So your code is the right way to go.
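One reason to prefer explicit raise statements in released code is that Python drops assert statements when it runs with the -O flag. The sketch below imitates that by compiling the same hypothetical `check` function at two optimization levels with the built-in `compile`; the names `SOURCE` and `load_check` are only for illustration.

```python
SOURCE = '''
def check(a, b):
    assert b > a, "b must be greater than a"
    return "accepted"
'''


def load_check(optimize):
    """Compile SOURCE at the given optimization level and return its check()."""
    namespace = {}
    exec(compile(SOURCE, "<assert-demo>", "exec", optimize=optimize), namespace)
    return namespace["check"]


check_debug = load_check(0)      # asserts kept, like plain `python`
check_optimized = load_check(1)  # asserts removed, like `python -O`

try:
    check_debug(2, 1)
    debug_result = "accepted"
except AssertionError:
    debug_result = "rejected"

# With the assert compiled away, the invalid pair goes through unchecked.
optimized_result = check_optimized(2, 1)
```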

As for the `__new__` magic method, today I found a good example in the standard-library turtle module, in the definition of the Vec2D class:

class Vec2D(tuple):
    """A 2 dimensional vector class, used as a helper class
    for implementing turtle graphics.
    May be useful for turtle graphics programs also.
    Derived from tuple, so a vector is a tuple!

    Provides (for a, b vectors, k number):
       a+b vector addition
       a-b vector subtraction
       a*b inner product
       k*a and a*k multiplication with scalar
       |a| absolute value of a
       a.rotate(angle) rotation
    """
    def __new__(cls, x, y):
        return tuple.__new__(cls, (x, y))

    ...

As you know, tuple takes a single argument, which must be an iterable. The developers of this module probably wanted to change that, so they defined `__new__` with the signature (cls, x, y) and called `tuple.__new__` with (cls, (x, y)). Here cls is the class being instantiated. For more information, see: Calling __new__ when making a subclass of tuple
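Since the question asked for an example, here is a sketch of where validation inside `__new__` can make sense: an immutable tuple subclass modelled on Vec2D. The `Interval` class is hypothetical and not part of any library; for an ordinary mutable class like `Test`, `__init__` remains the simpler place.

```python
class Interval(tuple):
    """Hypothetical immutable pair (a, b) with b > a, built the way Vec2D is."""

    def __new__(cls, a, b):
        # The tuple's contents are fixed by tuple.__new__, so the two
        # arguments have to be checked and packed here, not in __init__.
        if not isinstance(a, int) or not isinstance(b, int):
            raise TypeError("a and b must be integers")
        if b <= a:
            raise ValueError("b must be greater than a")
        return super().__new__(cls, (a, b))

    @property
    def a(self):
        return self[0]

    @property
    def b(self):
        return self[1]


interval = Interval(1, 3)
```

If validation fails, `__new__` raises before any object is handed back to the caller.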

Ekrem Dinçel
1

The way you do it in the first code snippet is okay, but since Python is quite versatile in what you can pass to a function or a class, there is much more to check than mere argument types if you go that way.

Duck typing makes checking argument types less reliable: a provided object could comply with what a function or a constructor needs without deriving from some known class.
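A small sketch of that point, using a made-up `Meters` class that behaves like an integer without inheriting from int. An isinstance check rejects it, while `operator.index` from the standard library accepts any object that implements `__index__`.

```python
import operator


class Meters:
    """Hypothetical integer-like type that does not derive from int."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        # Lets Python use this object wherever an exact integer is needed.
        return self._value


def as_int(obj):
    """Accept anything integer-like; operator.index raises TypeError otherwise."""
    return operator.index(obj)


length = Meters(5)
passes_isinstance = isinstance(length, int)  # the strict check turns it away
converted = as_int(length)                   # the duck-typed check accepts it
```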

You may also want to check argument names or similar things.

The most common style is to not test inputs at all and assume the caller is safe (in internal code), and to use a dedicated module such as zope.interface for interactions with the external world.

To keep the syntax lighter, interface checking is also typically done with decorators.
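As a rough illustration of the decorator style, here is a sketch of a hypothetical `require_ints` decorator. It is not taken from zope.interface or any other library; it uses only `functools` and `inspect` from the standard library.

```python
import functools
import inspect


def require_ints(*names):
    """Hypothetical decorator: the named arguments must be ints."""

    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Map positional and keyword arguments onto parameter names.
            bound = signature.bind(*args, **kwargs)
            for name in names:
                if name in bound.arguments and not isinstance(
                    bound.arguments[name], int
                ):
                    raise TypeError(f"{name} must be an int")
            return func(*args, **kwargs)

        return wrapper

    return decorator


class Test:
    @require_ints("a", "b")
    def __init__(self, a, b):
        self.a = a
        self.b = b
```

The checks stay out of the body of `__init__`, which then only shows the attributes the class has.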

PS: the `__new__` method is about object creation (allocation); it is mainly needed when subclassing immutable types or writing metaclasses. It is definitely unrelated to interface checks.

kriss