Data Validation in a Constructor?

Question

I am a beginner to OO programming, and this question is about handling data validation in a particular design. I have read this thread about a similar problem, and I like the second solution, which recommends redesigning the constructor interface so that invalid values are impossible, but I wonder whether this idea is also applicable to my case. I have outlined my thoughts below and would like to know whether people agree with them.

I have a Die class, which is constructed by passing a vector of probabilities. All the probabilities must be non-negative, and must add up to 1.0.

class Die
{
public:
    Die(/* Probability vector. */);
    int Roll();
};

I can think of several options to implement data validation:

1. The Die constructor is protected and accepts an std::vector<double>. Then have a static public method that performs the data validation, and returns a pointer to a new Die object if this is successful, and a NULL otherwise.
2. Create a ProbabilityVector class, which wraps around an std::vector<double>, and implement the data validation there (with the protected constructor + public static method above). Then the Die constructor accepts a (pointer to a) ProbabilityVector.
3. Implement a separate ProbabilityVectorValidation class, with a static method on an std::vector<double> returning a bool. Then, when necessary, use this method before passing an std::vector<double> to the Die constructor.

The disadvantage of (1) is that the validation is carried out every time a Die is created, even though this may not be necessary. (2) gets around this problem, but I'm not sure if the protected constructor + static public approach is elegant, because it involves raw pointers. (3) is looking like an attractive option, but seems less robust than (1) and (2)?
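
For illustration, here is a minimal sketch of what option (2) might look like without raw pointers. It is not a definitive design: the names `Create`, `values`, `Faces` and the `tolerance` parameter are hypothetical, and returning `std::optional` (which requires C++17) is an assumption swapped in for the "pointer or NULL" return described above.

```cpp
#include <cmath>
#include <cstddef>
#include <numeric>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical wrapper: an instance can only exist if validation succeeded.
class ProbabilityVector {
public:
    // Returns an empty optional (instead of NULL) when the input is invalid.
    // The default tolerance is an assumption, not a value from the question.
    static std::optional<ProbabilityVector> Create(std::vector<double> values,
                                                   double tolerance = 1e-9) {
        if (values.empty()) return std::nullopt;
        for (double v : values) {
            if (!(v >= 0.0)) return std::nullopt;  // rejects negatives and NaN
        }
        const double sum = std::accumulate(values.begin(), values.end(), 0.0);
        if (std::abs(sum - 1.0) > tolerance) return std::nullopt;
        return ProbabilityVector(std::move(values));
    }

    const std::vector<double>& values() const { return values_; }

private:
    // Private, so the only way to get an instance is through Create().
    explicit ProbabilityVector(std::vector<double> values)
        : values_(std::move(values)) {}

    std::vector<double> values_;
};

class Die {
public:
    // No validation needed here: the parameter type already guarantees it.
    explicit Die(ProbabilityVector probabilities)
        : probabilities_(std::move(probabilities)) {}

    std::size_t Faces() const { return probabilities_.values().size(); }

private:
    ProbabilityVector probabilities_;
};
```

With this shape, the check runs once when the `ProbabilityVector` is created, and the same validated object can be copied into any number of `Die` instances without being re-validated.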

It is hard to tell which one is better based on your description. I suggest you just pick one and implement it, then refine your design while implementing it — Matt, Apr 08 '14 at 17:48
I'm going to go ahead and say I always go with 1, which is the simplest (KISS). The biggest problem I see is you trying to get a bunch of doubles to sum to 1 (protip, they (almost) never will). — IdeaHat, Apr 08 '14 at 17:49
@MadScienceDreams Nice protip. In practice it's enough to make sure they add up to between 0.99 and 1.01. Then I can normalize them. — MGA, Apr 08 '14 at 17:51
(why does 2 require raw pointers? As 3 is not required, I don't see how this would be any different than putting in the comment for the argument "The sum must be ~=1.0") — IdeaHat, Apr 08 '14 at 17:52
@mga Um...if I have a set S of N doubles, if sigma(S)=1.01, sigma(S/1.01) still won't always equal 1 (though it will within epsilon). — IdeaHat, Apr 08 '14 at 17:55
@MadScienceDreams Passing a unique_ptr does not work in this case (because if you create it inside the static function and then pass it by value, it goes out of scope before it's copied, and a mess ensues). Maybe passing in a unique_ptr by reference would work though; I'll try that out. — MGA, Apr 08 '14 at 17:56
I assume that a die would have equal probability per face unless it's a lopsided die? If it's equal, perhaps a derived die type is in order, so there is no need to pass a vector; it's just known and always correct? Even if it's not a symmetrical die, it feels like a new type of die should be created for it. Anyway, you could create a ProbabilityVector class which can report/make sure the probs are valid, and then you only instantiate the die when it's valid? Else you have to start messing around with checking NULLs, private members in die to report if valid or not, etc. — Chris, Apr 08 '14 at 17:56
@Chris The die will be lopsided :) The idea of the vector is to specify how it is so. — MGA, Apr 08 '14 at 17:57
@mga You could use a unique_ptr by either passing by reference, or `std::move`ing it. — IdeaHat, Apr 08 '14 at 17:59
I would suggest (1) but with a slight change: instead of returning null on failure, throw an exception. This way you can return your Die object by-value and not have to drag in pointer semantics. A free-standing function like `Die make_die(std::vector);` would probably do the trick just fine. — bstamour, Apr 08 '14 at 18:01
@mga also I highly suggest looking into [boost math](http://www.boost.org/doc/libs/1_54_0/libs/math/doc/html/index.html) [tutorial link](http://stackoverflow.com/questions/9951620/sampling-from-a-discrete-probability-distribution-in-c) — IdeaHat, Apr 08 '14 at 18:03
@bstamour There is no reason not to just build it into the constructor if we use exceptions. I mean, you may be able to handle the exceptions to do clean up, but IMHO if you have an exception and it's not a fatal error then you are using exceptions wrong. — IdeaHat, Apr 08 '14 at 18:11
@MadScienceDreams Well, this avoids the issue about re-validating the input data though. The factory function can check the additivity of the vector and then defer it to a private constructor (make the factory a friend). On the topic of exceptions: if your objects must preserve additivity in order to be considered "valid" then an exception is absolutely appropriate. Sub or super-additive dies are the exception, not the norm. — bstamour, Apr 08 '14 at 18:21
Alternatively, within the constructor you could normalize the input vector by dividing each element by the sum of all elements. — Edward, Apr 08 '14 at 18:32
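
Several of the comments above (throw instead of returning NULL, accept a sum that is only approximately 1, then normalize) can be combined. The following is a rough sketch under stated assumptions, not anyone's actual implementation: the helper `Normalized`, the accessor `probabilities`, and the generator parameter on `Roll` are hypothetical additions, and the default tolerance of 0.01 simply mirrors the 0.99 to 1.01 range mentioned in the comments. Sampling uses the standard library's `std::discrete_distribution`, which also normalizes the weights it is given.

```cpp
#include <cmath>
#include <numeric>
#include <random>
#include <stdexcept>
#include <utility>
#include <vector>

class Die {
public:
    // Throws std::invalid_argument if the input cannot describe a die.
    // The tolerance default is an assumption taken from the comment thread.
    explicit Die(std::vector<double> probabilities, double tolerance = 0.01)
        : probabilities_(Normalized(std::move(probabilities), tolerance)),
          distribution_(probabilities_.begin(), probabilities_.end()) {}

    // Returns a face index in [0, number of faces).
    // Taking the generator as a parameter is a choice made for this sketch.
    template <class Generator>
    int Roll(Generator& rng) { return distribution_(rng); }

    const std::vector<double>& probabilities() const { return probabilities_; }

private:
    // Hypothetical helper: validates, then rescales so the sum is ~1.
    static std::vector<double> Normalized(std::vector<double> p,
                                          double tolerance) {
        if (p.empty()) throw std::invalid_argument("no probabilities given");
        for (double v : p) {
            if (!(v >= 0.0))
                throw std::invalid_argument("negative or NaN probability");
        }
        const double sum = std::accumulate(p.begin(), p.end(), 0.0);
        // Compare against a tolerance; an exact == 1.0 test would be fragile.
        if (!(std::abs(sum - 1.0) <= tolerance))
            throw std::invalid_argument("probabilities do not sum to ~1");
        for (double& v : p) v /= sum;
        return p;
    }

    std::vector<double> probabilities_;          // declared first: initialized first
    std::discrete_distribution<int> distribution_;
};
```

A caller would construct the die inside a `try` block (or let the exception propagate) and pass any standard engine to `Roll`, for example a `std::mt19937`. Even after normalization the stored values may not sum to exactly 1.0 in floating point, so any later checks should also compare against a tolerance.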
