2

I'm new to OOP in Python, and let's assume I have a class that does a simple calculation:

class Calc: 
    def __init__(self, n1, n2): 
        self.n1 = n1 
        self.n2 = n2 

    def sum(self):
        return self.n1 + self.n2

In this simplified example, what is the best way to validate the attributes of the class? For example, say if I expect a float for n1 and n2 such that I define my constructor as:

self.n1 = float(n1) 
self.n2 = float(n2) 

If n1 or n2 were None, I would get a TypeError, since None can't be converted to a float. For some reason, it feels 'wrong' for me to have logic in the constructor of the Calc class to catch this.
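As a quick check of which exception float() raises for bad input, here is a minimal sketch. The helper name describe_float_failure is made up for illustration only.

```python
def describe_float_failure(value):
    """Return the name of the exception float(value) raises, or None on success."""
    try:
        float(value)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__
    return None


print(describe_float_failure(None))   # TypeError: None is not a string or a number
print(describe_float_failure("abc"))  # ValueError: a string, but not a numeric one
print(describe_float_failure("1.5"))  # None: the conversion succeeds
```

So a constructor that calls float() can fail in two different ways, depending on whether the argument has the wrong type or merely the wrong content.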

Would I have some sort of validation logic before ever creating the instance of the class to catch this upstream?

Is there a way for me to use some technique to validate on the fly like perhaps decorators or property annotations?

Any advice is appreciated

Bob
  • 2
    see https://stackoverflow.com/questions/2825452/correct-approach-to-validate-attributes-of-an-instance-of-class – balderman Mar 27 '20 at 14:27
  • Ahh thank you @balderman I knew it was something to do with `properties` – Bob Mar 27 '20 at 14:30
  • 1
    What behaviour do you desire? Do you just want to tell clients of your code what types you expect, and if they mess up it's their fault? Do you want to coerce what you get to the correct type? Do you want to silently pick a default if a wrong value is given (as your notes on ``None`` imply)? Do you want verification only on parameters passed to the constructor, or on any assignment of attributes? Do you want a static or dynamic verification? – MisterMiyagi Mar 27 '20 at 14:56
  • All valid questions, I have enough information from the answers to go forward :) – Bob Mar 27 '20 at 14:57

3 Answers

3

This depends on where you get your data from and how simple you want your code to be. If you want this class to absolutely verify input data you can't trust, e.g. because it comes directly from user input, then you do explicit validation:

class Calc: 
    def __init__(self, n1, n2): 
        if not all(isinstance(n, float) for n in (n1, n2)):
            raise TypeError('All arguments are required to be floats')

        self.n1 = n1 
        self.n2 = n2 

The next level down from this would be debugging assertions:

class Calc: 
    def __init__(self, n1, n2): 
        assert all(isinstance(n, float) for n in (n1, n2)), 'Float arguments required'

        self.n1 = n1 
        self.n2 = n2 

assert statements can be disabled for a performance gain (for example, by running Python with the -O flag), so they should not be relied upon as actual validation. However, if your data passes through a validation layer before this and you generally expect your arguments to be floats, then this is nice and concise. It also doubles as pretty decent self-documentation.
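To illustrate why assertions are not real validation, the sketch below compiles the same class twice with the built-in compile(): once with optimize=0 (asserts kept) and once with optimize=1 (asserts stripped, as under python -O). The names SOURCE and build_calc are hypothetical helpers for this demonstration.

```python
SOURCE = '''
class Calc:
    def __init__(self, n1, n2):
        assert all(isinstance(n, float) for n in (n1, n2)), 'Float arguments required'
        self.n1 = n1
        self.n2 = n2
'''


def build_calc(optimize):
    """Compile SOURCE at the given optimization level and return the Calc class."""
    namespace = {}
    exec(compile(SOURCE, "<calc_demo>", "exec", optimize=optimize), namespace)
    return namespace["Calc"]


CheckedCalc = build_calc(0)    # assert statements are kept
UncheckedCalc = build_calc(1)  # assert statements are removed, like python -O

try:
    CheckedCalc(None, 2.0)
except AssertionError as exc:
    print("assert active:", exc)

# With asserts stripped, the bad value is stored without complaint.
calc = UncheckedCalc(None, 2.0)
print("assert stripped, n1 =", calc.n1)
```

The second instance carries None as n1, and the problem only surfaces later, when something tries to do arithmetic with it.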

The next step after this is type annotations:

class Calc: 
    def __init__(self, n1: float, n2: float): 
        self.n1 = n1 
        self.n2 = n2 

This is even more readable and self-documenting, but never does anything at runtime. This depends on static type checkers to analyse your code and point out obvious mistakes, such as:

Calc(input(), input())

Such problems can be caught and pointed out to you by a static type checker (because input is known to return strings, which doesn't fit the type hint), and they're integrated in most modern IDEs.
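A minimal sketch of the fix such a checker nudges you towards: convert at the point where the strings arrive, so the hints on Calc hold everywhere else. The function calc_from_text is a hypothetical helper, and the exact wording of the checker's complaint depends on the tool.

```python
class Calc:
    def __init__(self, n1: float, n2: float) -> None:
        self.n1 = n1
        self.n2 = n2

    def sum(self) -> float:
        return self.n1 + self.n2


def calc_from_text(raw1: str, raw2: str) -> Calc:
    # A static checker such as mypy would be expected to reject
    # Calc(raw1, raw2) here, because str does not match the float hints.
    # Converting explicitly satisfies the hints; float() may still raise
    # ValueError at runtime if the text is not numeric.
    return Calc(float(raw1), float(raw2))


calc = calc_from_text("1.5", "2.5")
print(calc.sum())  # 4.0
```

The hints themselves cost nothing at runtime; the only runtime work is the explicit float() conversion at the boundary.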

Which strategy is best for you and your situation is for you to decide. Varying combinations of all three approaches are used in everyday code.

deceze
  • Thank you for this information, I will look into a combination of `explicit validation` and `type annotations` - performance is the top priority here, but ideally I would want valid data to continue and incorrect data to maybe be added to a CSV report or something – Bob Mar 27 '20 at 14:50
  • It's a tradeoff between unit testing and static analysing to find problems before runtime, and actual runtime testing to ensure validity. It really depends on what layers your data passes through before reaching this part of the code whether you need explicit checks or not. – deceze Mar 27 '20 at 14:52
  • I might create a small validation with `assertions` before it gets to this stage, as I can sacrifice performance at that stage, but at this point I need it to be as fast as possible – Bob Mar 27 '20 at 14:55
  • How would I overwrite the attribute if it didn't satisfy the condition rather than raising an error? – Bob Mar 27 '20 at 14:59
  • That depends on you. Do you want to require your class to be passed the correct inputs, or do you want to try to coerce *any* value into… something? The latter always requires runtime computation, which is detrimental to performance. It also makes the inputs wishy washy and possibly hard to control. It's easiest to take typing seriously and ensure data is of the correct and expected type as early as possible (e.g. at the input/output boundaries), then simply rely on that throughout the rest of the application. – deceze Mar 27 '20 at 15:03
  • 2
    I require the class to be passed the correct inputs - I guess to answer my own question to a certain extent, I should be focusing more on validation block before getting anywhere near an instance of `Calc` – Bob Mar 27 '20 at 15:05
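The "validate upstream, report the rejects" idea from this thread could look roughly like the sketch below. It is only one possible shape, under assumptions: split_rows and write_report are hypothetical names, the input is assumed to be pairs of raw values, and the report is written to an in-memory stream rather than a real file.

```python
import csv
import io


def split_rows(rows):
    """Separate rows that convert cleanly to floats from rows that do not."""
    valid, rejected = [], []
    for n1, n2 in rows:
        try:
            valid.append((float(n1), float(n2)))
        except (TypeError, ValueError) as exc:
            # Keep the raw values and the reason so they can be reported later.
            rejected.append((n1, n2, str(exc)))
    return valid, rejected


def write_report(rejected, stream):
    """Write rejected rows to any file-like object in CSV form."""
    writer = csv.writer(stream)
    writer.writerow(["n1", "n2", "reason"])
    writer.writerows(rejected)


rows = [("1.5", "2"), (None, "3"), ("abc", "4")]
valid, rejected = split_rows(rows)

report = io.StringIO()
write_report(rejected, report)
print(valid)
print(report.getvalue())
```

Only the rows in valid would then be handed to Calc, which can skip runtime checks on the hot path.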
2

Validating types is a fight you cannot win. It comes with serious overhead and will still not protect you against errors – if you receive wrong types, all you can do is fail.

Default to having types statically verifiable by using type hints:

class Calc: 
    def __init__(self, n1: float, n2: float): 
        self.n1 = n1 
        self.n2 = n2 

    def sum(self):
        return self.n1 + self.n2

This allows IDEs and type checkers, e.g. mypy, to validate type correctness statically. It has no runtime overhead, and can be checked as part of continuous integration and similar.

For critical parts where corrupted state is not acceptable, use assertions to verify types.

class Calc: 
    def __init__(self, n1: float, n2: float):
        assert isinstance(n1, float)
        assert isinstance(n2, float)
        self.n1 = n1 
        self.n2 = n2 

    def sum(self):
        return self.n1 + self.n2

Assertions do have runtime overhead, but they can be switched off completely once (type) correctness has been verified.

MisterMiyagi
-1

Just validate the values before initializing the attributes:

class Calc: 
    def validate(self,n1,n2):
        if not isinstance(n1, float) or not isinstance(n2, float):
            return False
        return True

    def __init__(self, n1, n2): 
        if self.validate(n1,n2):
            self.n1 = n1 
            self.n2 = n2 

    def sum(self):
        return self.n1 + self.n2
mad_
  • Thanks for answer - I will probably use your logic of `if not isinstance` with the `properties` info provided from @balderman – Bob Mar 27 '20 at 14:37
  • 1
    Returning `True` or `False` is not a good idea, it puts the burden of checking the validity of the data on each piece of code calling it, that would have to check the return value. It would be much better to raise an exception. – Thierry Lathuille Mar 27 '20 at 14:39
  • 1
    Silently not doing anything in the constructor is probably not a good idea, that should definitely raise an exception. – deceze Mar 27 '20 at 14:44
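Putting the two comments above together with the properties idea mentioned earlier in the thread, a minimal sketch might look like this. It is one possible arrangement, not the approach of any answer here; the helper _require_float is a made-up name.

```python
class Calc:
    def __init__(self, n1, n2):
        # Assigning through the properties runs the same checks in the constructor.
        self.n1 = n1
        self.n2 = n2

    @staticmethod
    def _require_float(name, value):
        # Raise instead of returning a flag, so callers cannot ignore a failure.
        if not isinstance(value, float):
            raise TypeError(f"{name} must be a float, got {type(value).__name__}")
        return value

    @property
    def n1(self):
        return self._n1

    @n1.setter
    def n1(self, value):
        self._n1 = self._require_float("n1", value)

    @property
    def n2(self):
        return self._n2

    @n2.setter
    def n2(self, value):
        self._n2 = self._require_float("n2", value)

    def sum(self):
        return self.n1 + self.n2


print(Calc(1.5, 2.5).sum())  # 4.0

try:
    Calc(None, 2.5)
except TypeError as exc:
    print(exc)  # n1 must be a float, got NoneType
```

Because the check lives in the setters, later assignments such as calc.n1 = "x" are rejected as well. Note that isinstance(value, float) also rejects plain ints, which may or may not be what you want.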