Circumventing RTTI on legacy code

Question

I have been looking for a way to get around the slowness of the dynamic cast type checking. Before you start saying I should redesign everything, let me inform you that the design was decided on 5 years ago. I can't fix all 400,000 lines of code that came after (I wish I could), but I can make some changes. I have run this little test on type identification:

#include <iostream>
#include <typeinfo>
#include <stdint.h>
#include <ctime>

using namespace std;

#define ADD_TYPE_ID \
    static intptr_t type() { return reinterpret_cast<intptr_t>(&type); }\
    virtual intptr_t getType() { return type(); }

struct Base
{
    ADD_TYPE_ID;
};

template <typename T>
struct Derived : public Base
{
    ADD_TYPE_ID;
};

int main()
{
    Base* b = new Derived<int>();
    cout << "Correct Type: " << (b->getType() == Derived<int>::type()) << endl; // true
    cout << "Template Type: " << (b->getType() == Derived<float>::type()) << endl; // false
    cout << "Base Type: " << (b->getType() == Base::type()) << endl; // false

    clock_t begin = clock();
    {
        for (size_t i = 0; i < 100000000; i++)
        {
            if (b->getType() == Derived<int>::type())
                Derived <int>* d = static_cast<Derived<int>*> (b);
        }
    }
    clock_t end = clock();
    double elapsed = double(end - begin) / CLOCKS_PER_SEC;

    cout << "Type elapsed: " << elapsed << endl;

    begin = clock();
    {
        for (size_t i = 0; i < 100000000; i++)
        {
            Derived<int>* d = dynamic_cast<Derived<int>*>(b);
            if (d);
        }
    }
    end = clock();
    elapsed = double(end - begin) / CLOCKS_PER_SEC;

    cout << "Type elapsed: " << elapsed << endl;

    begin = clock();
    {
        for (size_t i = 0; i < 100000000; i++)
        {
            Derived<int>* d = dynamic_cast<Derived<int>*>(b);
            if ( typeid(d) == typeid(Derived<int>*) )
                static_cast<Derived<int>*> (b);
        }
    }
    end = clock();
    elapsed = double(end - begin) / CLOCKS_PER_SEC;

    cout << "Type elapsed: " << elapsed << endl;

   return 0;
}

It seems that using the class ID (the first timed solution above) would be the fastest way to do type checking at runtime. Will this cause any problems with threading? Is there a better way to check types at runtime (without much refactoring)?

Edit: I should also add that this needs to work with the TI compilers, which currently only support up to C++03.

I have no idea what `if ( typeid(d) == typeid(Derived<int>*) )` is intended to do. Besides, none of your tests seem to have side effects, so they could all be dropped by the optimizer. — dyp, Aug 25 '14 at 23:19
@KerrekSB You mean the OP should quit the job, before starting to struggle with that legacy code? — πάντα ῥεῖ, Aug 25 '14 at 23:20
@πάνταῥεῖ: Well, if I were told I had to make 400k loc of slow legacy code fast, I would like to know my options if this fails... — Kerrek SB, Aug 25 '14 at 23:23
I don't think you need to invent your own type IDs. You should be able to just use `typeid(Base)` etc. Those are static already. (That's basically how `boost::any` does it.) — Kerrek SB, Aug 25 '14 at 23:24
@KerrekSB _"Well, if I were told I had to make 400k loc ..."_ I'd try to develop some tools to support refactoring (estimate efforts and success chances respectively) ;) ... — πάντα ῥεῖ, Aug 25 '14 at 23:24
As I thought, the second and third test seem to be dropped entirely by the optimizer (clang++, g++). The first one isn't, probably due to the virtual function call. — dyp, Aug 25 '14 at 23:26
Agreed with @dyp, the third variant seems pointless. You've already done the dynamic cast. — Kerrek SB, Aug 25 '14 at 23:37
@dyp: By adding `volatile`s in suitable places, you can get the loops to execute. — Kerrek SB, Aug 25 '14 at 23:37
I am not very familiar with the optimizer. Are there any good resources that you would recommend for brushing up on how it works? The system was designed to have a Python "glue" layer, but that eventually became C++, so there are generic types being passed between blocks. The blocks each cast them to more specific types. It is a mess of casting with too many "we just know"s in it. That career overflow is sounding better every day. — Cory-G, Aug 25 '14 at 23:52
It's hard to tell if it would suit your use case, but it's possible to [homebrew](https://github.com/phs/sauce/blob/ac11912/sauce/internal/type_id.h) RTTI by relying on the [one definition rule](http://stackoverflow.com/q/7670000/580412). — phs, Aug 26 '14 at 00:39
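
The optimizer concern raised in the comments above can be sketched as follows. This is only one possible way to keep a timing loop observable, assuming a `Base`/`Derived<T>` layout like the one in the question; the helper name `count_matches` and the `volatile` local are illustrative and not from the original post.

```cpp
#include <cstddef>

struct Base
{
    virtual ~Base() {}  // makes Base polymorphic so dynamic_cast is allowed
};

template <typename T>
struct Derived : public Base
{
};

// Hypothetical helper: counts how often the dynamic_cast to T succeeds.
// The pointer is re-read through a volatile on every iteration, so the
// compiler cannot hoist the cast out of the loop, and the returned count
// gives the loop a result the caller can observe.
template <typename T>
std::size_t count_matches(Base* b, std::size_t n)
{
    Base* volatile p = b;
    std::size_t hits = 0;
    for (std::size_t i = 0; i < n; ++i)
    {
        if (dynamic_cast<T*>(p) != 0)
            ++hits;
    }
    return hits;
}
```

Timing a call such as `count_matches<Derived<int> >(b, 100000000)` and printing the returned count should stop the loop from being discarded, although the `volatile` read itself adds a small cost to every iteration.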

Kerrek SB · Accepted Answer · 2014-08-26T09:12:10.197

First off, note that there's a big difference between dynamic_cast and RTTI in the sense of typeid: the cast tells you whether you can treat a base object as some further derived, but not necessarily most-derived, object. typeid tells you the precise most-derived type. Naturally the former is more powerful and more expensive.

So then, there are two natural ways you can select on types if you have a polymorphic hierarchy. They're different; use the one that actually applies.

void method1(Base * p)
{
    if (Derived * q = dynamic_cast<Derived *>(p))
    {
        // use q
    }
}

void method2(Base * p)
{
    if (typeid(*p) == typeid(Derived))
    {
        Derived * q = static_cast<Derived *>(p);  // spelled out instead of auto so it compiles as C++03

        // use q
    }
}

Note also that method 2 is not generally available if the base class is a virtual base. Neither method applies if your classes are not polymorphic.

In a quick test I found method 2 to be significantly faster than your manual ID-based solution, which in turn is faster than the dynamic cast solution (method 1).
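
The difference between the two methods shows up as soon as there is a class further down the hierarchy. A small sketch, using a hypothetical three-level hierarchy (the class `MoreDerived` and the two helper functions are made up for illustration and are not part of the answer above):

```cpp
#include <typeinfo>

struct Base { virtual ~Base() {} };
struct Derived : Base {};
struct MoreDerived : Derived {};

// Method 1: true when *p is a Derived or any class derived from it.
bool is_a_derived(Base* p)
{
    return dynamic_cast<Derived*>(p) != 0;
}

// Method 2: true only when the most-derived type of *p is exactly Derived.
// p must not be null here, because typeid(*p) throws std::bad_typeid
// for a null pointer to a polymorphic type.
bool is_exactly_derived(Base* p)
{
    return typeid(*p) == typeid(Derived);
}
```

For a `MoreDerived` object, `is_a_derived` reports a match while `is_exactly_derived` does not, so method 2 is a drop-in replacement only where the checked type has no subclasses of its own.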

edited Aug 26 '14 at 09:12

answered Aug 26 '14 at 00:10


I mostly said "Circumventing RTTI" because on the TI compiler and a few others, just having RTTI on slows everything down a lot. It would be nice to turn it off altogether. I think I might implement a solution that is easily swapped out so I can test the whole system's speed with the different solutions. – Cory-G Aug 26 '14 at 15:57
Oh, never mind. I copied the example [here](http://www.cplusplus.com/reference/typeinfo/type_info/operator==/) and changed the stuff with Base and Derived forgetting your comment about the need for virtual. – Cory-G Sep 04 '14 at 18:48
Everything seems to be working. With this and many many other changes, I have been able to cut the runtime almost in half! Thanks for your help. – Cory-G Sep 04 '14 at 19:20
@CoryB: Glad it was useful :-) – Kerrek SB Sep 04 '14 at 19:26
*The cast tells you whether you can treat a base object as some further derived* didn't you mean an (unknown) object pointed to by a `base*`? – Walter Oct 12 '14 at 14:57
@Walter: What's the difference? An object pointed to by a `base *` is necessarily an object of type `base`. – Kerrek SB Oct 12 '14 at 17:28
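
The "easily swapped out" idea from the comments above could look roughly like this. It is a sketch under assumptions: the function name `fast_cast` and the macro `USE_DYNAMIC_CAST` are invented for this example, and the two variants are not equivalent, since the typeid branch matches the exact type only.

```cpp
#include <typeinfo>

struct Base { virtual ~Base() {} };
struct Derived : Base {};
struct Other : Base {};

// Hypothetical wrapper: every call site uses fast_cast<T>(p), and the
// strategy is chosen in one place by defining USE_DYNAMIC_CAST or not.
template <typename T>
T* fast_cast(Base* p)
{
#ifdef USE_DYNAMIC_CAST
    // Matches T and anything derived from T.
    return dynamic_cast<T*>(p);
#else
    // Matches only objects whose most-derived type is exactly T.
    // The null check comes first because typeid(*p) must not see a null p.
    if (p != 0 && typeid(*p) == typeid(T))
        return static_cast<T*>(p);
    return 0;
#endif
}
```

A call site then reads `Derived* d = fast_cast<Derived>(p);`, and rebuilding with or without `-DUSE_DYNAMIC_CAST` (or the compiler's equivalent flag) allows the whole system to be timed under each strategy.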

score 1 · Answer 2 · answered Aug 25 '14 at 23:26

How about comparing the classes' virtual function tables?

Quick and dirty proof of concept:

void* instance_vtbl(void* c)
{
    return *(void**)c;
}

template<typename C>
void* class_vtbl()
{
    static C c;
    return instance_vtbl(&c);
}

// ...

begin = clock();
{
    for (size_t i = 0; i < 100000000; i++)
    {
        if (instance_vtbl(b) == class_vtbl<Derived<int> >())  // "> >" so C++03 compilers do not parse ">>" as a shift
            Derived <int>* d = static_cast<Derived<int>*> (b);
    }
}
end = clock();
elapsed = double(end - begin) / CLOCKS_PER_SEC;

cout << "Type elapsed: " << elapsed << endl;

With Visual C++'s /Ox switch, this appears 3x faster than the type/getType trick.

Vladimir Panteleev

The reason why this is faster is that it doesn't use a virtual function call, I suppose. The tests still have no side effects and cannot be used to compare the different approaches. This approach is not a portable solution (strictly speaking, probably undefined behaviour due to aliasing). I guess it indirectly relies on RTTI, since otherwise the vtable of two classes could be identical (=> is it reliable?) – dyp Aug 25 '14 at 23:30
This has problems with any types that aren't default constructable as well as being UB. (Also fails for types without vtables, but meh) – Mooing Duck Aug 26 '14 at 00:30

score 0 · Answer 3 · answered Aug 25 '14 at 23:34

Given this type of code

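class A {
public:
    virtual ~A() {}  // dynamic_cast requires a polymorphic base class
};

class B : public A {
};

A * a;  // assumed to point at an A or a B
B * b = dynamic_cast<B*> (a);
if( b != 0 ) { /* do something B specific */ }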

The polymorphic (right?) way to fix it is something like this

class A {
public:
    virtual void specific() { /* do nothing */ }
};

class B : public A {
public:
    virtual void specific() { /* do something B specific */ }
};

A * a;
if( a != 0 ) a->specific();

score 0 · Answer 4 · answered Aug 25 '14 at 23:44

When MSVC 2005 first came out, dynamic_cast<> for 64-bit code was much slower than for 32-bit code. We wanted a quick and easy fix. This is what our code looks like. It probably violates all kinds of good design rules, but the conversion to remove dynamic_cast<> can be automated with a script.

class dbbResultEph;  // forward declaration, needed by the return types below

class dbbMsgEph {
public:
    virtual dbbResultEph *              CastResultEph() { return 0; }
    virtual const dbbResultEph *        CastResultEph() const { return 0; }
};

class dbbResultEph : public dbbMsgEph {
public:
    virtual dbbResultEph *              CastResultEph() { return this; }
    virtual const dbbResultEph *        CastResultEph() const { return this; }
    static dbbResultEph *               Cast( dbbMsgEph * );
    static const dbbResultEph *         Cast( const dbbMsgEph * );
};

dbbResultEph *
dbbResultEph::Cast( dbbMsgEph * arg )
{
    if( arg == 0 ) return 0;
    return arg->CastResultEph();
}

const dbbResultEph *
dbbResultEph::Cast( const dbbMsgEph * arg )
{
    if( arg == 0 ) return 0;
    return arg->CastResultEph();
}

Where we used to have

dbbMsgEph * pMsg;
dbbResultEph * pResult = dynamic_cast<dbbResultEph *> (pMsg);

we changed it to

dbbResultEph * pResult = dbbResultEph::Cast (pMsg);

using a simple sed(1) script. And virtual function calls are pretty efficient.

score 0 · Answer 5 · answered Aug 25 '14 at 23:55

In release mode (VS2008), this prints true:

cout << "Base Type: " << (b->getType() == Base::type()) << endl;

I guess it's because of the optimization. So I changed the implementation of Derived::type():

template <typename T>
struct Derived : public Base
{
    static intptr_t type() 
    { 
        cout << "different type()" << endl;
        return reinterpret_cast<intptr_t>(&type); 
    }
    virtual intptr_t getType() { return type(); }
};

Then the two IDs differ again. So how should this be dealt with when using this method?

Are you perhaps seeing the effects of [identical](http://blogs.msdn.com/b/oldnewthing/archive/2005/03/22/400373.aspx) [COMDAT](http://msdn.microsoft.com/en-us/library/bxwfs976%28v=vs.80%29.aspx) folding? — chwarr, Aug 26 '14 at 00:10
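
If identical COMDAT folding is indeed the cause, the bodies of `Base::type()` and `Derived<T>::type()` compile to identical code and the linker may merge them, so both end up with the same address. One possible workaround, sketched here under the assumption that the linker folds only functions and read-only data, is to take the address of a writable static object per type instead of the address of a function. The names `TypeTag` and `type_id` are made up for this example.

```cpp
#include <stdint.h>

// Hypothetical tag holder: one distinct, writable static object per T.
template <typename T>
struct TypeTag
{
    static char tag;
};

template <typename T>
char TypeTag<T>::tag = 0;

// The address of the per-type tag serves as the type ID.
template <typename T>
intptr_t type_id()
{
    return reinterpret_cast<intptr_t>(&TypeTag<T>::tag);
}

struct Base
{
    virtual ~Base() {}
    virtual intptr_t getType() const { return type_id<Base>(); }
};

template <typename T>
struct Derived : public Base
{
    virtual intptr_t getType() const { return type_id<Derived<T> >(); }
};
```

A check then reads `b->getType() == type_id<Derived<int> >()`. Whether these IDs stay unique across shared-library boundaries depends on the toolchain, so that would need to be verified separately.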
