3

I'm working on building Cppcheck on AIX with the xlC compiler (see previous question). Checker classes all derive from a Check class, whose constructor registers each object in a global list:

check.h

class Check {
public:
    Check() {
        instances().push_back(this);
        instances().sort();
    }
    static std::list<Check *> &instances();
    virtual std::string name() const = 0;
private:
    bool operator<(const Check *other) const {
        return (name() < other->name());
    }
};

checkbufferoverrun.h

class CheckBufferOverrun: public Check {
public:
    // ...
    std::string name() const {
        return "Bounds checking";
    }
};

The problem I appear to be having is with the instances().sort() call. sort() will call Check::operator<() which calls Check::name() on each pointer in the static instances() list, but the Check instance that was just added to the list has not yet had its constructor fully run (because it's still inside Check::Check()). Therefore, it should be undefined behaviour to call ->name() on such a pointer before the CheckBufferOverrun constructor has completed.

Is this really undefined behaviour, or am I missing a subtlety here?

Note that I don't think the call to sort() is strictly required, but the effect is that Cppcheck runs all its checkers in a deterministic order. This only affects the output in the order in which errors are detected, which causes causes some test cases to fail because they're expecting the output in a particular order.

Update: The question as above still (mostly) stands. However, I think the real reason why the call to sort() in the constructor wasn't causing problems (ie. crashing by calling a pure virtual function) is that the Check::operator<(const Check *) is never actually called by sort()! Rather, sort() appears to compare the pointers instead. This happens in both g++ and xlC, indicating a problem with the Cppcheck code itself.

Community
  • 1
  • 1
Greg Hewgill
  • 951,095
  • 183
  • 1,149
  • 1,285
  • 1
    Add a string "name" parameter to the `Check` constructor, and have descendants provide a value at construction time. Make `name()` be non-virtual and return the value passed in by the constructor. The rest of the `Check` constructor shown here can remain as-is. – Rob Kennedy Feb 01 '11 at 22:02
  • @Rob Kennedy: That's a good solution, I'll propose that to the maintainers. – Greg Hewgill Feb 01 '11 at 22:19
  • Yes, and think about whether it would be worth using a set instead of a list for the instances - that could be faster than explicitly calling sort after each insert. BTW, your operator< will never get called by sort because it compares Check to Check*. – Axel Feb 01 '11 at 23:49

4 Answers4

4

Yes, it's undefined. The standard specifically says so in 10.4/6

Member functions can be called from a constructor (or destructor) of an abstract class; the effect of making a virtual call (10.3) to a pure virtual function directly or indirectly for the object being created (or destroyed) from such a constructor (or destructor) is undefined.

Rob Kennedy
  • 161,384
  • 21
  • 275
  • 467
2

It is true that calling a pure virtual function from a constructor is always an undefined behaviour.

The virtual pointer can not be assumed to be set until the constructor has run completely (closing "}"), and hence any call to a virtual function (or pure virtual function) has to be setup at the time of compilation itself (statically bound call).

Now, if the virtual function is pure virtual function, the compiler will generally insert its own implementation for such pure virtual function, the default behavior of which is to generate a segmentation fault. The Standard does not dictate what should be the implementation of a pure virtual function, but most of C++ compilers adopt aforesaid style.

If your code is not causing any runtime mischief demeanour, then it is not getting called in the said call sequence. If you could post the implementation code for below 2 functions

instances().push_back(this);
instances().sort();

then maybe it will help to see what's going on.

Viren
  • 2,161
  • 22
  • 27
  • The filenames in the question link through to the actual full code on Github. – Greg Hewgill Feb 01 '11 at 21:51
  • Sorry, I didn't know that was a link. I'm looking at Check.h file, but I don't see any sort() function declared in that header. :( – Viren Feb 01 '11 at 22:01
  • SORRY. My bad. it is a function of list. – Viren Feb 01 '11 at 22:03
  • 1
    No compiler's behaviour is to "generate a segmentation fault". What nonsense is this? – Lightness Races in Orbit Feb 08 '12 at 00:06
  • this is what i had seen when using ms vc++ ide back in my symbian days. also, for a more logical explanation, please see this: http://stackoverflow.com/questions/99552/where-do-pure-virtual-function-call-crashes-come-from That question is not exactly related to this answer, but invoking a pure virtual function would precipitate in a seg fault. – Viren Feb 08 '12 at 21:33
0

As long as object construction isn't finished, a pure virtual function may not be called. However, if it's declared pure virtual in a base class A, then defined in B (derived from A), the constructor of C (derived from B) may call it, since B's construction is complete.

In your case, use a static constructor instead:

class check {
private Check () { ... }
public:
    static Check* createInstance() {
        Check* check = new Check();
        instances().push_back(check);
        instances().sort();
    }
...
}
Axel
  • 13,939
  • 5
  • 50
  • 79
  • You can't create an instance of an ABC, so the this statement should not even compile: Check* check = new Check(); – Viren Feb 01 '11 at 22:00
  • Could you elaborate on that? I'm not sure I understand what you mean (or you don't understand the example). Apart from that there's a colon missing in my code, that is. – Axel Feb 01 '11 at 23:35
  • Oh, I see... You were referring to the Check class, not to ABC from the example I gave. Yes, Check cannot be instantiated - I overlooked Check itself is the base class declaring the pure virtual function. – Axel Feb 01 '11 at 23:41
0

I think your real problem is that you've conflated two things: the Checker base class, and some mechanism for registering (derived) instances of Check.

Among other things, this isn't particularly robust: I may want to use your Checker classes, but I may want to register them differently.

Maybe you could do something like this: Checker get a protected ctor (it's abstract anyway, and so only derived classes ought to be calling the Checker ctor).

Derived classes also have protected ctors, and a public static method (the "named constructor pattern") to create instances. That creating method news up a Checker subclass, and them passes it (fully created at this point) to a CheckerRegister class (which is also abstract, so users can implemented their own if need be).

You use whatever singleton pattern, or dependency injection mechanism, that you prefer, to instantiate a Checkerregister and make it available to Checker subclasses.

One simple way to do this would be to have a getCheckerRegister static method on Checker.

So a Checker subclass might look like this:

class CheckBufferOverrun: public Check { protected: CheckBufferOverrun : Check("Bounds checking") { // since every derived has a name, why not just pass it as an arg? } public: CheckBufferOverrun makeCheckBufferOverrun() { CheckBufferOverrun that = new CheckBufferOverrun();

   // get the singleton, pass it something fully constructed
   Checker.getCheckerRegister.register(that) ;
   return that;
}

If it looks like this will end up being a lot of boilerplate code, write a template. If you worry that because each template instance in C++ is a real and unique class, write a non-templated base class that will register any Checker-derived.

tpdi
  • 34,554
  • 11
  • 80
  • 120
  • Yes, the registration mechanism used in Cppcheck appears to be an attempt to minimise the amount of "connector" code needed to implement a new subclass of `Check` and have it used by the main checking loop. In theory, all one needs to do is declare and implement a derived class from `Check`, create a global static instance, and the rest of the program will start using it. I probably wouldn't have chosen to do it this way. – Greg Hewgill Feb 01 '11 at 22:17