I'm running into two distinct issues while using boost::asio::thread_pool in my ECS game engine framework.

Explanation: I'm trying to use 3 worker threads to update 3 systems concurrently. I know this approach works because I've used my own thread pool implementation and got correct results, but it was only roughly 33% faster than serial updating because of my use of atomics.

With a naive implementation I was able to achieve roughly 4x the speed of serial updating; however, my main thread would occasionally proceed without letting every thread finish. So I'm trying to find a middle ground using boost (since I already use it in my engine), but the documentation is surprisingly sparse.

The expected behavior is this:

  1. Update Logic
  2. Update Physics
  3. Update Graphics
  4. Wait for all jobs to finish and then proceed.
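To make the intended semantics concrete, here is a minimal standard-library sketch of the same fork-and-wait pattern. It is not my engine code and it does not use boost: `Counters` and `UpdateAllAndWait` are hypothetical stand-ins, and each "system update" is reduced to incrementing a counter.

```cpp
#include <atomic>
#include <future>

// Hypothetical stand-ins for the state touched by the three systems.
struct Counters {
    std::atomic<int> logic{0};
    std::atomic<int> physics{0};
    std::atomic<int> graphics{0};
};

// Starts the three updates concurrently, then blocks until all of them
// have finished. Only after that does the caller proceed.
inline void UpdateAllAndWait(Counters& c, float dt)
{
    // dt is captured by value so each job owns its own copy.
    auto logic    = std::async(std::launch::async, [&c, dt] { (void)dt; ++c.logic; });
    auto physics  = std::async(std::launch::async, [&c, dt] { (void)dt; ++c.physics; });
    auto graphics = std::async(std::launch::async, [&c, dt] { (void)dt; ++c.graphics; });

    // Step 4: wait for every job before returning.
    logic.get();
    physics.get();
    graphics.get();
}
```

Calling `UpdateAllAndWait` once per frame should leave all three counters equal to the number of frames. This version starts new asynchronous tasks on every call, so it only illustrates the behavior I expect, not the performance I'm after.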

Here's my basic boost::asio::thread_pool use-case:

void Level::MultithreadedUpdate(const float& dt)
{
    dispatch(*threadpool, [&]() {MultithreadedUpdateLogic(dt); });
    dispatch(*threadpool, [&]() {MultithreadedUpdatePhysics(dt); });
    dispatch(*threadpool, [&]() {MultithreadedUpdateGraphics(dt); });
    threadpool->join();
}

I've tried this with post, dispatch and defer, mostly because their documentation is practically identical.

But I've stuck with dispatch because of the answer to this question: Boost asio io_service dispatch vs post

This code didn't work. It ran into the same issue of not waiting for every thread to finish before proceeding, even though it's practically identical to the example use case boost provides. I tried using .stop() instead, and weirdly enough that worked more often, but not 100% of the time.

Now the second issue (the weird part):

After refactoring my code base and storing the thread pool inside my engine singleton, I am getting totally unexpected behavior from thread_pool. I'm unit testing the integrity of the updates by incrementing an integer stored on millions of components, thousands of times in quick succession. MultithreadedUpdate() runs once successfully, and on every subsequent iteration it does not run the jobs at all.

Debugging the thread pool at run time, I can see that scheduler_ (inside thread_pool) begins to accumulate outstanding work.

Looking through boost's code, it looks like join() eventually calls stop_all_threads() which sets stopped_ to true, which prevents jobs from running any more. They don't provide access to anything that can restart it, and on successive iterations stopped_ never returns to false, preventing any job from being completed.

Does anyone have more information on how to use thread_pool or what may be causing this? I'm using boost 1.66.0.

(I'm currently building a fresh boost 1.67.0, hoping that it's an issue with my library download)

Edit (more information):

I'm getting the same results with 1.67.0. The very first iteration works as expected: all jobs complete, in the correct order, and the main thread does not proceed too early. scheduler_.outstanding_work_ is 0 at the end of that update. On subsequent updates it accumulates 2 at a time, i.e. the outstanding_work_ counter has grown by 2 at the end of each update, which is odd. I verified that no system is actually being updated: one of the jobs completes without doing anything at all, and the other two simply sit in the queue.

I also notice that scheduler_.task_interrupted_ is set to true, which doesn't make sense to me in this context, since I never called .stop().

Jon Koelzer