3

I've been struggling with a rather baffling problem: I occasionally get segmentation faults when my entity manager iterates through the map of entities in the update loop. The strange thing is that this doesn't happen all the time; sometimes it will crash on loading and sometimes I can switch between app states (and load and unload entities many times) a few times before I get the segfault. I also seem to get more segfaults in debug mode. My entities consist of pointers to a Behavior and Drawable class.

My call stack after the segfault:

#0 6FCB4986 libstdc++-6!_ZSt18_Rb_tree_incrementPSt18_Rb_tree_node_base() (C:\MinGW\bin\libstdc++-6.dll:??)
#1 0040A1D7 std::_Rb_tree_iterator<std::pair<unsigned int const, Entity*> >::operator++(this=0x28fe94) (c:/mingw/bin/../lib/gcc/mingw32/4.6.2/include/c++/bits/stl_tree.h:196)
#2 00401F55 EntityManager::onLoop(this=0x417238) (C:\Users\Nelarius\Documents\Kurssit\Miinaharava\src\engine\EntityManager.cpp:75)
#3 00401640 App::onLoop(this=0x417040) (C:\Users\Nelarius\Documents\Kurssit\Miinaharava\src\engine\App.cpp:38)
#4 0040160C App::execute(this=0x417040) (C:\Users\Nelarius\Documents\Kurssit\Miinaharava\src\engine\App.cpp:30)
#5 00403BD7 main(argc=1, argv=0x642908) (C:\Users\Nelarius\Documents\Kurssit\Miinaharava\src\main.cpp:15)

Here's my update loop:

void EntityManager::onLoop()
{
    std::map<const unsigned int, Entity*>::iterator it;

    for (it = _gameObjects.begin(); it != _gameObjects.end(); it++)
    {
        Behavior* behavior = it->second->getBehavior();
        if (behavior)
        {
            behavior->update();
        }
    }
}

I get the segfault at

for (it = _gameObjects.begin(); it != _gameObjects.end(); it++)

By the way, is it normal for there to be two threads when I'm not using any multithreading? I was looking at the Code::Blocks debug windows, and happened to see that there were two threads in the thread watch window (only one of which was active though).

Lightness Races in Orbit
  • 378,754
  • 76
  • 643
  • 1,055
Nelarius
  • 181
  • 2
  • 10
  • 3
    Welcome to the car crash that is naked pointers. – Kerrek SB Jan 28 '13 at 13:25
  • 1
    Do you have a pair in the map where Entity (i.e. it->second) is actually NULL? However, this sounds like memory corruption. But you need to tell us more. – Marius Bancila Jan 28 '13 at 13:32
  • 1
    @MariusBancila: not necessarily NULL, that would be quite easy to detect. Dangling would be worse. He is using raw pointers, so it's possible that a pointer is not NULL, yet pointing to a destroyed object. Dereferencing such a pointer is Undefined Behavior. This is why smart pointers should be used. – Andy Prowl Jan 28 '13 at 13:35
  • I guarantee you it's not "random". – Lightness Races in Orbit Jan 28 '13 at 13:44
  • I was hoping that my entity manager class would not be able to reference dangling pointers. I just added a check for it->second == NULL, but it doesn't seem to catch anything, and I still get segfaults at `for (it = _gameObjects.begin(); it != _gameObjects.end(); it++)` – Nelarius Jan 28 '13 at 13:48
  • Yeah well programs are deterministic so it's not random, there just isn't any pattern to the segmentation faults that I've been able to see yet :D – Nelarius Jan 28 '13 at 13:49
  • @Nelarius: It'll be deterministic at the level of what happens to be in various points of memory so it would be very difficult to actually predict :P Possible, though, if you tracked every read/write and dereference combination! – Lightness Races in Orbit Jan 28 '13 at 14:58

2 Answers2

2

Often this sort of thing comes down to behavior->update() being able to result, through a sequence of nested function calls, in the _gameObjects container being modified (for example if some condition that is detected within the game object results in the creation or removal of game objects for whatever reason).

That can invalidate your iterator and break your loop if you removed an element from the map, and can be hard to spot in "kernel" code like this.

A common solution is to copy the list of game objects for the loop. You wouldn't copy the objects themselves, of course, but you are protecting the list of them from being mutated in the middle of an update run.

It's also "fairer" in terms of scheduling — you essentially avoid the possibility of an DDoS attack launched by disgruntled self-replicating game objects. :)

Yakk - Adam Nevraumont
  • 262,606
  • 27
  • 330
  • 524
Lightness Races in Orbit
  • 378,754
  • 76
  • 643
  • 1,055
  • You are absolutely correct, and I feel kind of stupid now for not seeing this. I'm not exactly sure how the map was getting mutated during iteration, but this must have been the case. Iterating through a local copy does not result in segfaults. – Nelarius Jan 28 '13 at 14:00
  • "That invalidates your iterator and breaks your loop" -- Iterators in `map`s are not invalidated by someone else adding an element to the `map` [23.1.2/8]. – Yakk - Adam Nevraumont Jan 28 '13 at 18:31
  • @Yakk: Adding is not the only possibility. The "current" element could be removed. Please see [this question](http://stackoverflow.com/questions/6438086/iterator-invalidation-rules) for all the iterator invalidation rules. – Lightness Races in Orbit Jan 29 '13 at 03:10
  • @Non-StopTimeTravel see edit: saying "as an example, you can add elements", followed by "that can invalidate your iterator", isn't well worded. – Yakk - Adam Nevraumont Jan 29 '13 at 05:12
0

You are iterating over non local data and then call a non local function.

Create a local map. swap the class map into it. Iterate over that local map. Assert the class map is umchanged after iteration is done (you also need to assert that all attempts to remove stuff from the class map succeed).

Then swap the local map back to the class's map.

This will tell you if your crashes where caused by your bad design. The design remains problematic even if this doesn't generate any asserts, because innocuous code changes far away from the above code could cause problems like the above to occur: non-locality of code correctness causes problems. Non-locality of code correctness with asserts can be tolerable with sufficient testing.

The general callback problem requires you to make compromises about what it means to be a registered callback, and involves thinking about the problem in a way analogous to multi-threaded code. Code complexity can get really high if you aren't willing to compromise. Use of smart pointers to simplify things is advised.

Yakk - Adam Nevraumont
  • 262,606
  • 27
  • 330
  • 524