6

I'm trying to implement a component based architecture on a game engine project. Each GameObject has an unordered_map that holds a pointer to Component base class. At this point, I only have one component derived class, which is the Transform class. I wanted to implement this component based architecture similar to the Unity's convention: I want to get a component of game object by calling the member template function like GetComponent<Transform>().

Here are the headers:

Component.h

enum type{
    TRANSFORM   // more will be added later
};

class Component // base class
{
public:
    Component() : _owner(NULL) {}
    virtual ~Component(){}
    static type Type;

protected:
    GameObject* _owner;
};

Transform.h

class Transform : public Component
{
public:
    Transform();
    ~Transform();
    static type Type;

    void Rotate(float deg);

    // to be encapsulated later on
    Vector2D _position;
    float _rotation;
    Vector2D _scale;
};

GameObject.h

class GameObject
{
public:
    GameObject();
    ~GameObject();

    void Update();
    //and more member functions

    template<class T>
    T* GetComponent();
private:
    // some more private members
    unordered_map<type, Component*> _componentList; // only 1 component of each type
};

template<class T>
T* GameObject::GetComponent()
{       
    return static_cast<T*>(_componentList[T::Type]);
}

My initial implementation used std::vector for keeping Component* and the application ran at 60 fps (I also have a frame rate controller, which just limits the FPS to 60). When I changed to the unordered_map for accessing those component pointers, the performance went downhill to 15 FPS.

enter image description here

I only draw two quads and I call GetComponent<Transform>() only 6 times per frame at this point, so there is not much going on in the scene.


What I tried?

I tried to use const char*, std::string, type_info and finally enum type as key values for the unordered_map but nothing really helps: all implementations got me 15-16 FPS.

What causes this performance issue? How can I isolate the issue?

I hope I provided enough detail, feel free to ask for more code if necessary

Varaquilex
  • 3,447
  • 7
  • 40
  • 60
  • 5
    Entities usually only have few components, maybe 5 or at most 30. For 30 elements a vector with linear search outperforms a hash_map. I would just stick to vectors. – nwp Oct 29 '15 at 09:57
  • 1
    I don't know anything about game programming, but couldn't you instantiate a pointer array with always num_components elements? The type enum values can be cast as indices into the array, and if you have few components it might not be a high memory overhead – dgel Oct 29 '15 at 10:13
  • 1
    what else did you change? I don't believe you *only* changed the storage container of your components. calculating a hash value is trivial. lower than the cost of sending draw instructions to a GPU. – Richard Hodges Oct 29 '15 at 10:34
  • When this happens I usually assume something is being copied that shouldn't be being copied. It's unlikely to be in any of the code you've posted here. – Robinson Oct 29 '15 at 10:38
  • 1
    @RichardHodges seriously, i barely have any components. I just change the container type. If I just revert it back to `vector` again by making changes in GameObject.h and GameObject.cpp (In constructor, I simply add one Transform component to the container : `//_componentList.push_back(new Transform());` to `_componentList.emplace(Transform::Type, new Transform());` ) – Varaquilex Oct 29 '15 at 10:39
  • 1
    if what you say is true, i guess you'll need to profile it and see what's going on. the bug *will not* be in the standard library. On a separate note, for goodness' sake please store the pointer in the map/vector with a unique_ptr or shared_ptr. Seeing raw pointers in containers makes everyone very sad :( – Richard Hodges Oct 29 '15 at 11:11

1 Answers1

1

Disclaimer: while the information below should hold regardless, the first basic sanity check given such a drastic difference in performance over such a simple case is to first make sure that build optimizations are on. With that out of the way...

unordered_map is designed ultimately as a fairly large-scale container, preallocating a potentially large number of buckets.

See here: std::unordered_map very high memory usage

And here: How does C++ STL unordered_map resolve collisions?

While computing a hash index is trivial, the amount of memory (and strides between) being accessed for such small unordered_maps could very easily turn into a cache-missing bottleneck with something as accessed frequently as retrieving a component interface out of an entity.

For entity-component systems, typically you don't have that many components associated to an entity -- maybe something like dozens tops, and often only a few. As a result, std::vector is actually a far more suitable structure and primarily in terms of locality of reference (small arrays that can often be accessed repeatedly every time you fetch a component interface out of an entity). While a lesser point, std::vector::operator[] is also a function that is trivially-inlined.

If you want to do even better than std::vector here (but I only recommend this after profiling and determining you need it), provided that you can deduce some common case upper bound, N, for the number of components typically available in an entity, something like this may work even better:

struct ComponentList
{
    Component* components[N];
    Component** ptr;
    int num;
};

Start off by setting ptr to components and access them subsequently through ptr. Inserting a new component increases num. When num >= 4 (rare case), change ptr to point to a dynamically-allocated block with a larger size. When destroying ComponentList, free the dynamically-allocated memory if ptr != components. This wastes a little memory if you're storing less than N elements (though std::vector also typically does this with an initial capacity and the way it grows), but it turns your entity and the component list provided into a completely contiguous structure unless num > N. As a result, you get the best locality of reference and potentially even better results than what you started with (I'm assuming from the significant drop in frame rates that you were fetching components out of entities quite frequently out of loops which is not uncommon in ECS).

Given how frequently component interfaces can be accessed from an entity, and very often in very tight loops, this might be worth the effort.

Nevertheless, your original choice of std::vector was actually a better one given the typical scale of the data (number of components available in an entity). With very tiny data sets, basic linear-time sequential searches often outperform more sophisticated data structures, and you often want to instead focus much more on memory/cache efficiency.

I tried to use const char*, std::string, type_info and finally enum type as key values for the unordered_map but nothing really helps: all implementations got me 15-16 FPS.

Just on this note, for keys you want something that can be compared in constant time like an integral value. One possibility that might be handy in this case is an interned string which just stores an int for a balance between convenience and performance (allowing clients to construct them through a string but compare them through an int during component lookups).

Community
  • 1
  • 1
  • 1
    Thanks for the detailed answer. I forgot to provide an answer to this post. The problem was when I was looping through game objects in the update function, `for(auto obj : gameObjecList)`, I did not create references `for(auto& obj : gameObjecList)` but copies, hence all the ctor/dtor calls and low fps. Nevertheless, I'm glad you pitched in. – Varaquilex Nov 27 '15 at 01:15
  • Ah I see, if you include copying overhead on top of that along with the rather largish initial `unordered_map` size, that would exacerbate the issue quite a bit. But that kind of relative performance you see between `unordered_map` and `vector` should also show up even without the copy, once you reach a large enough scale. If scalability is the focus, it helps here to focus on cache-friendly memory access which you can get if you use smaller containers here for fetching components, and ones that search for them in a more sequential, contiguous kind of fashion. –  Nov 27 '15 at 09:37