I have a task that parallelizes very well, so I would like to use multiple threads to speed up my program. However, it is not as simple as creating the threads and letting them run. The threads have to execute a certain task repeatedly with breaks in between, i.e. the pseudocode would look like this:
    loop
        wake threads up
        calculate x using the threads
        pause threads
        calculate something else without the threads
This happens very frequently, 60 times per second to be exact, so creating new threads every time would be far too slow. I tried solving this with a state variable for each thread (Running, Paused, Stopped) combined with either an event-like construct built on condition variables or a polling mechanism.
Both of these only gave me about a 2x speedup, which is less than I would expect to be possible, considering that only about 5% of the time is spent within a critical section (and my CPU offers 4 cores * 2 = 8 hyper-threads).
I'd imagine the issue with the condition variable is that the wake-up is not immediate but comes with some delay, which means wasted runtime. The polling approach is slightly slower, I guess because the paused threads are still using the CPU and thereby slow down the code that runs while they are paused.
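To make the condition-variable variant concrete, here is a minimal sketch of the general idea, assuming C++11 `std::thread`. It is simplified and not my exact code: the class name `FramePool`, the generation counter that stands in for the per-thread state variable, and the `run` signature are all made up for illustration, and `run` is assumed to be called from a single thread only.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: persistent workers that sleep between frames.
class FramePool {
public:
    explicit FramePool(std::size_t n) {
        assert(n > 0);
        for (std::size_t i = 0; i < n; ++i)
            workers_.emplace_back([this, i] { workerLoop(i); });
    }

    ~FramePool() {
        {
            std::lock_guard<std::mutex> lock(m_);
            stop_ = true;
        }
        wake_.notify_all();
        for (auto& t : workers_) t.join();
    }

    // Wake all workers, run job(index) on each, block until all are done.
    void run(const std::function<void(std::size_t)>& job) {
        std::unique_lock<std::mutex> lock(m_);
        job_ = &job;
        pending_ = workers_.size();
        ++generation_;  // a new generation means "new work is available"
        wake_.notify_all();
        done_.wait(lock, [this] { return pending_ == 0; });
    }

private:
    void workerLoop(std::size_t index) {
        std::size_t seen = 0;  // last generation this worker has processed
        std::unique_lock<std::mutex> lock(m_);
        for (;;) {
            // Paused: sleep until there is a new generation or a stop request.
            wake_.wait(lock, [&] { return stop_ || generation_ != seen; });
            if (stop_) return;
            seen = generation_;
            const auto* job = job_;
            lock.unlock();  // do the actual work outside the lock
            (*job)(index);
            lock.lock();
            if (--pending_ == 0) done_.notify_one();
        }
    }

    std::mutex m_;
    std::condition_variable wake_, done_;
    std::vector<std::thread> workers_;
    const std::function<void(std::size_t)>* job_ = nullptr;
    std::size_t generation_ = 0;
    std::size_t pending_ = 0;
    bool stop_ = false;
};
```

In the main loop this would be used as one `pool.run(...)` call per iteration, followed by the part that is calculated without the threads. Each iteration pays for one wake-up of every worker plus one wake-up of the main thread, which is where I suspect the delay comes from.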
What would be the best way to implement what I have in mind?