Well-known solution for avoiding the slowness of dynamic_cast?

Question

I needed run-time polymorphism, so I used dynamic_cast.
But now I had two problems -- dynamic_cast was extremely slow! (Scroll down for benchmark.)

Long story short, I ended up solving the problem this way, using static_cast:

struct Base
{
    virtual ~Base() { }
    virtual int type_id() const = 0;

    template<class T>
    T *as()
    { return this->type_id() == T::ID ? static_cast<T *>(this) : 0; }

    template<class T>
    T const *as() const
    { return this->type_id() == T::ID ? static_cast<T const *>(this) : 0; }
};

struct Derived : public Base
{
    enum { ID = __COUNTER__ };  // warning: can cause ABI incompatibility
    int type_id() const { return ID; }
};

int main()
{
    Base const &value = Derived();
    Derived const *p = value.as<Derived>();  // "static" dynamic_cast
}

But I'm surely not the first person to run into this problem, so I thought it might be worth asking:

Instead of coming up with a home-baked solution like this, is there a well-known pattern/library I can use for solving this problem in the future?

Sample benchmark

To get a feel for what I'm talking about, try the code below -- dynamic_cast was approximately 15 times slower than a mere virtual call on my machine (110 ms. against 1620 ms. with the code below):

#include <cstdio>
#include <ctime>

struct Base { virtual unsigned vcalc(unsigned i) const { return i * i + 1; } };
struct Derived1 : public Base 
{ unsigned vcalc(unsigned i) const { return i * i + 2; } };
struct Derived2 : public Derived1
{ unsigned vcalc(unsigned i) const { return i * i + 3; } };

int main()
{
    Base const &foo = Derived2();
    size_t const COUNT = 50000000;
    {
        clock_t start = clock();
        unsigned n = 0;
        for (size_t i = 0; i < COUNT; i++)
            n = foo.vcalc(n);
        printf("virtual call: %d ms (result: %u)\n",
            (int)((clock() - start) * 1000 / CLOCKS_PER_SEC), n);
        fflush(stdout);
    }
    {
        clock_t start = clock();
        unsigned n = 0;
        for (size_t i = 0; i < COUNT; i++)
            n = dynamic_cast<Derived1 const &>(foo).vcalc(n);
        printf("virtual call after dynamic_cast: %d ms (result: %u)\n",
            (int)((clock() - start) * 1000 / CLOCKS_PER_SEC), n);
        fflush(stdout);
    }
    return 0;
}

When I simply remove the word virtual and change dynamic_cast to static_cast, I get a 79 ms running time -- so a virtual call is only slower than a static call by ~25%!

Instead of the dynamic_cast operator, use the typeid operator, or, better, a virtual function call. Such operator may take a time noticeably longer than a virtual function call, and longer also than the typeid operator. — Tutankhamen, Sep 10 '12 at 01:31
*"instead of dynamic_cast, use a virtual function call"* Uhm, did you read the question by any chance? — user541686, Sep 10 '12 at 01:33
The obvious question would be "why do you need `dynamic_cast`"? In most cases, you can eliminate casts altogether with a better design, but how this is done depends on the specific problem you are trying to solve. — casablanca, Sep 10 '12 at 01:37
@casablanca: Sure, though that's completely orthogonal to my question here. — user541686, Sep 10 '12 at 01:38
@Mehrdad: It may be, but it is unclear from your question if that is the case. I still think that if you are calling `dynamic_cast` so many times that it actually affects performance, you may need to revisit your design. — casablanca, Sep 10 '12 at 01:42
*Well known solution for avoiding the slowness of `dynamic_cast`*? Not using it. Runtime polymorphism and `dynamic_cast` are almost opposite concepts: polymorphism is being able to get a reference to an object and use it without caring about the type, `dynamic_cast` is caring about the type because you cannot get to a feature polymorphically. The best advice has already been given by @casablanca: revisit the design. If you are using `dynamic_cast`, the goal is not to get it to be faster, but rather remove it from the design. — David Rodríguez - dribeas, Sep 10 '12 at 03:41
@DavidRodríguez-dribeas: Good point, I hadn't thought of it that way... indeed it looks like the opposite of polymorphism, now that I think about it that way. :) Thanks for pointing it out. — user541686, Sep 10 '12 at 03:48
Have a look at [LLVM's `isa`](http://stackoverflow.com/q/6038330/395760), `dyn_cast`, etc. which place restrictions on the class hierarchy and take some manual work for each class, but is ludicrously efficient (just loading a member and comparing it to a constant IIRC). It may be overkill, or the restrictions may be too severe, but it's quite interesting nevertheless. — , Sep 10 '12 at 13:32

score 3 · Accepted Answer · answered Sep 10 '12 at 01:48

3

Most uses of dynamic_cast can be replaced with double dispatch (aka. visitor pattern). That would amount to two virtual calls, which by your benchmark is still 7.5 times faster than dynamic_cast.

answered Sep 10 '12 at 01:48

casablanca

69,683
7
133
150

+1, though what about when you can't modify the source code of the class you're visiting? Do you make a wrapper for every single class? – user541686 Sep 10 '12 at 01:54
@Mehrdad: You can still wrap/extend the class to add visitor support. – casablanca Sep 10 '12 at 02:03
Hmm, but extending isn't always possible (or a good idea!), and wrapping makes you have to change all your `foo.bar` into `wrapper.foo.bar`... is that really a good idea? – user541686 Sep 10 '12 at 02:05
@Mehrdad: It seems you're dealing with an external library (since you mentioned you can't modify it), so if your entire code base is coupled to that library, that's already a problem. Wrapping it seems like a good idea anyway, because it would shield you from future changes to that library. – casablanca Sep 10 '12 at 02:10
Well, I'm not dealing with an external library right *now*, since you can see in the question that I was able to modify the class here. :) But yes, I'm asking what I could do if I *were* in that situation in the future, since that was the intent of the question. So when I'm in that situation -- how does creating a wrapper "shield" me from changes? Then I'm just going to have to update the wrapper after any changes, and any nontrivial change (new or deprecated functionality) is going to require me to change the rest of the code anyway... seems kind of weird to make a wrapper for every library. – user541686 Sep 10 '12 at 02:15
@Mehrdad: Sorry, I didn't mean "wrapper" in the raw sense. In general, you would aim to reduce coupling with any external library, so you would create your own well-defined interface for what your application requires, and the implementation of this interface would talk to the library. If you do this right, the rest of your code will be immune to changes in the library. – casablanca Sep 10 '12 at 02:42

justin · Answer 2 · 2012-09-10T01:50:26.750

2

You may be interested in this constant time implementation: http://www.stroustrup.com/isorc2008.pdf

Also note that many upcasts may be simplified under specific constraints -- for example, if you do not use multiple inheritance, use only shallow inheritance, or otherwise guarantee that the type will have no common ancestors, some evaluations can be short circuited and an exhaustive evaluation (as provided by dynamic_cast) need not be performed.

As usual, profile your implementation against your vendor's implementation for your given use case and actual class hierarchies.

edited Sep 10 '12 at 01:50

answered Sep 10 '12 at 01:39

justin

104,054
14
179
226

1

Oh dang, that looks like an intense read! I'll look at it as soon as I can, thanks! +1 – user541686 Sep 10 '12 at 01:40
1

@Mehrdad i remember enjoying it when i read it a few years back. you're welcome – justin Sep 10 '12 at 01:41
1

I want to read it, but I've been reading programming stuff all day. This one will have to wait until tomorrow, but I'm sure I'll enjoy it. – chris Sep 10 '12 at 01:45

Well-known solution for avoiding the slowness of dynamic_cast?

Sample benchmark

2 Answers2

Linked