
I'm trying to implement a multithreaded game loop. I already did, but I had to use several locks for it, which ruined performance. After researching a bit I came up with this idea:

Instead of splitting the engine's subsystems across different threads (e.g. physics on one, animation on another), all subsystems run on all threads. So with four CPU cores, four threads are created, each running one loop that covers all subsystems; the single-core game loop is effectively replicated on all four threads. These game loops are controlled by one supervising loop, which sends messages (or 'jobs', 'tasks') to whichever of these threads is least busy, according to user input or scripts. This could be done with a double-buffered command buffer.

Only the rendering loop runs alone in its own thread, for maximum rendering performance. Now I'm looking for the best way to communicate with the rendering loop. The best idea I could come up with is to again use a command buffer and swap it whenever the rendering loop completes a frame. That way the rendering loop never has to wait for any of the game loops and can keep rendering. If a game loop hasn't finished when the rendering loop swaps the buffer, any commands issued after the swap are simply executed in the rendering loop's next frame. To make sure all objects are drawn even if a game loop hasn't finished, the rendering loop retains every object it has been told to draw and keeps drawing it until it receives a command to stop drawing it.

*(Example image omitted)*

My goal is to make the engine scale with the number of CPU cores and use all of them. Is this a viable way to do that? What is the best approach, and how do modern engines handle this?

NathanOliver
  • You should probably be aware that, depending on your rendering/event handling system, you may need the rendering to take place in the same thread as the user input, or need all user I/O to take place in the main thread. SDL, for instance, [requires the first](https://wiki.libsdl.org/CategoryThread) and [`#define`s the main function](http://stackoverflow.com/questions/11976084/why-sdl-defines-main-macro) so it always initializes its systems before the user-defined main function and exits them after it, which might force use of the main thread by default, if I remember correctly. – jaggedSpire Aug 24 '15 at 16:52
  • Might be a better question for gamedev.stackexchange.com? – Christian Hackl Aug 24 '15 at 16:58
  • Looks like a reasonable interim solution, though DX11/12 and Vulkan may make you think twice about coalescing everything into one command stream. The latter two even let you schedule work on individual GPUs (GL already does this with some extensions only available on workstation-class hardware). Discrete GPUs will have NUMA, and you would probably not want to distribute the same tasks across all of them if you want to utilize memory efficiently - that would be no better than the current mess we have with CrossFire and SLI. – Andon M. Coleman Aug 24 '15 at 18:12
  • There is a nice presentation that briefly touches on this [here](http://media.steampowered.com/apps/valve/2015/Pierre-Loup_Griffais_and_John_McDonald_Vulkan.pdf). The future of all these APIs looks to be keeping commands parallel and using GPU-side synchronization. – Andon M. Coleman Aug 24 '15 at 18:26
  • I know the new features of Vulkan and DX12. Although I would really like to start using them, they are not released yet as far as I know. –  Aug 24 '15 at 19:28

0 Answers