I'm learning C++ and am writing a real-time raytracer. At first I used std::thread to spread the work, but it turned out that starting 32 threads every frame takes far longer than the actual work that needs to be done.
Then I read that C++ implementations can use a thread pool to deal with that issue, through std::async():
    void trace()
    {
        constexpr unsigned int thread_count = 32;
        std::future<void> tasks[thread_count];
        for (auto i = 0u; i < thread_count; i++)
            tasks[i] = std::async(std::launch::async, raytrace_task(sample_count, world_), 10, 10);
        for (auto i = 0u; i < thread_count; i++)
            tasks[i].wait();
    }
For this test, raytrace_task does nothing at all:
    struct raytrace_task
    {
        // simple ctor omitted for brevity
        void operator()(int y_offset, int y_count)
        {
            // intentionally empty
        }
    };
But this is just as slow as creating the threads myself: each call to trace() takes about 30 ms, even though the tasks do nothing. Can anyone tell me what I'm doing wrong, or how to reuse threads? In other words, how do I keep posting many data-processing jobs to a reused thread over time, instead of starting a new one for each job?
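To make concrete what I mean by reusing threads, here is a rough sketch of the kind of thing I'm after. The worker_pool class and its post() and wait_idle() members are names I made up for illustration, not anything from the standard library, and I don't know whether this is the right approach. The idea is that the threads are started once and then sleep on a condition variable between jobs:

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper (not a standard class): threads are created once in
// the constructor and reused for every job posted afterwards.
class worker_pool
{
public:
    explicit worker_pool(unsigned int thread_count)
    {
        for (auto i = 0u; i < thread_count; i++)
            workers_.emplace_back([this] { run(); });
    }

    ~worker_pool()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_workers_.notify_all();
        for (auto& worker : workers_)
            worker.join(); // workers drain the queue before exiting
    }

    // Queue one job; it runs on whichever worker wakes up first.
    void post(std::function<void()> job)
    {
        assert(job && "job must be callable");
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
            ++pending_;
        }
        wake_workers_.notify_one();
    }

    // Block until every job posted so far has finished.
    void wait_idle()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        all_done_.wait(lock, [this] { return pending_ == 0; });
    }

private:
    void run()
    {
        for (;;)
        {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_workers_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty())
                    return; // stopping_ is set and nothing is left to do
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job(); // assumption: jobs do not throw
            {
                std::lock_guard<std::mutex> lock(mutex_);
                --pending_;
            }
            all_done_.notify_all();
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_workers_;
    std::condition_variable all_done_;
    std::queue<std::function<void()>> jobs_;
    std::size_t pending_ = 0;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};
```

With something like this, trace() would post its 32 jobs to a pool that was created once at startup and then call wait_idle(), rather than calling std::async 32 times per frame.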
Thank you.