I am using std::async with std::launch::async to launch some threads and parallelize calls to a rendering function that takes two integers (i.e. the x and y of a pixel in an image), does a lot of other computation, and eventually outputs some values (i.e. pixel values). Just for more context: the function I'm talking about has lots of variable declarations and calls to other functions.
I noticed that when I call that function through std::async, and it is called many times, I no longer get accurate results. In addition, my program runs faster without std::async. I'm a novice to threading in C++, but it looks like, despite the fact that the threads are supposed to run asynchronously and independently, they might sometimes try to access each other's resources (e.g. memory and CPU cores). Therefore, they might sometimes overwrite some stuff in memory, leading to inaccurate results.
I wonder, how can I guarantee that I always get accurate results when calling a function via std::async? I was searching for an answer, came across this and this post, and learned that I should probably be using something called atomic variables to make sure my threads do not use each other's resources and still run in parallel (instead of running sequentially). However, I could not find a good, clear example that shows how people use atomic variables to achieve this. Should the atomic variable be used inside my function, or when I am creating the threads (e.g. in the parameters I pass to the function)? I would appreciate it if someone could provide an example of this.
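To show what I have understood so far: as far as I can tell, an atomic variable makes a single read-modify-write on one shared value indivisible. Below is a minimal sketch of that understanding, using a hypothetical shared counter. The function countPixels and its parameters are made up for illustration and are not part of my renderer.

```cpp
#include <atomic>
#include <future>
#include <vector>

// Hypothetical example: several async tasks increment one shared counter.
// std::atomic<int> makes each increment indivisible, so no update is lost.
int countPixels(int tasks, int pixelsPerTask) {
    std::atomic<int> done{0};
    std::vector<std::future<void>> futures;
    futures.reserve(tasks);
    for (int t = 0; t < tasks; ++t) {
        futures.push_back(std::async(std::launch::async, [&done, pixelsPerTask] {
            for (int i = 0; i < pixelsPerTask; ++i) {
                done.fetch_add(1, std::memory_order_relaxed);
            }
        }));
    }
    for (auto& f : futures) f.get();  // wait for every task to finish
    return done.load();
}
```

This only protects a single integer. As far as I can tell, std::atomic requires a trivially copyable type, so it would not apply directly to a std::vector member, which is why I am unsure how it fits my case.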
Below is a simplified example of my code, which is very similar to this answer in how it calls a member function of a class instance.
#include <future>
#include <iostream>
#include <memory>
#include <vector>

class Render {
public:
    std::vector<float> render(int x, int y);
    std::vector<float> m_pixelValue;  // shared by every task that calls render() on this instance
    int m_width = 800;
    int m_height = 600;
};

std::vector<float> Render::render(int x, int y) {
    std::vector<float> currentPixelData;
    // do lots of work here and update currentPixelData
    m_pixelValue = currentPixelData;
    return m_pixelValue;
}

int main()
{
    std::vector<float> pixelResult;
    // each future carries the vector returned by render()
    std::vector<std::future<std::vector<float>>> threads;
    threads.reserve(5);
    std::unique_ptr<Render> renderInstance(new Render());
    for (int x = 0; x < renderInstance->m_width; x++) {
        for (int y = 0; y < renderInstance->m_height; y++) {
            // pass the raw pointer; a pointer to the unique_ptr itself does not compile
            threads.push_back(std::async(std::launch::async, &Render::render, renderInstance.get(), x, y));
            if (threads.size() == 5) {
                for (auto &th : threads) {
                    pixelResult = th.get();
                    canvasData.update(pixelResult);  // canvasData is declared elsewhere in my real code
                }
                threads.clear();
            }
        }
    }
}
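For comparison, here is a minimal sketch of a variant in which render() touches only local data and returns it by value, so that concurrent calls on the same instance have no shared member to overwrite. The names RenderLocal and renderAll, the small image size, and the placeholder pixel computation are all hypothetical stand-ins for my real code, and I have not confirmed that this is the right approach.

```cpp
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical variant of Render: render() is const and writes no member,
// so concurrent calls on one instance share no mutable state.
class RenderLocal {
public:
    std::vector<float> render(int x, int y) const {
        // Placeholder computation standing in for the real per-pixel work.
        std::vector<float> currentPixelData{static_cast<float>(x), static_cast<float>(y)};
        return currentPixelData;  // returned by value, owned by the caller
    }
    int m_width = 8;   // small sizes so the sketch runs quickly
    int m_height = 6;
};

// Hypothetical helper: renders every pixel in batches of `batchSize` tasks
// and collects the results on the calling thread, in submission order.
std::vector<std::vector<float>> renderAll(const RenderLocal& r, std::size_t batchSize) {
    std::vector<std::vector<float>> results;
    std::vector<std::future<std::vector<float>>> batch;
    batch.reserve(batchSize);
    auto drain = [&] {
        for (auto& f : batch) results.push_back(f.get());
        batch.clear();
    };
    for (int x = 0; x < r.m_width; ++x) {
        for (int y = 0; y < r.m_height; ++y) {
            batch.push_back(std::async(std::launch::async, &RenderLocal::render, &r, x, y));
            if (batch.size() == batchSize) drain();
        }
    }
    drain();  // collect any tasks left over from the final partial batch
    return results;
}
```

Calling renderAll(RenderLocal{}, 5) from main would return one vector per pixel. Only the calling thread appends to results, so the sketch needs no atomic variable or lock.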
Here's what I get when I run the function sequentially, without async: