I have an application in which I perform costly calculations in parallel worker threads. For simplicity, I write results to stdout directly from these threads.
This worked fine until I changed a few things in an attempt to make the code run faster. First, I replaced std::endl with "\n" to avoid flushing after every line. I also added the following lines to the initialization part of my main program:
std::cin.tie(nullptr);
std::ios_base::sync_with_stdio(false);
The basic structure of the worker thread code looks like this:
while (true) {
    // get data from job queue, protected by unique_lock on std::mutex
    // process the data
    // print results
    {
        std::lock_guard<std::mutex> lk(outputMutex_);
        std::cout << "print many results" << "\n"; // was originally std::endl
    }
}
Since this "optimization", the output of the workers occasionally gets mixed, i.e. the mutex no longer seems to serve its intended purpose.
Why is this happening? My understanding is that there is just a single stdout stream buffer, and that the data arrives in that buffer in sequence, even if the buffer is not flushed before the mutex is released. But that does not seem to be the case...
(I realize that maybe it would be nicer to have the output generated in a separate thread, but then I'd need to pass back these results using another queue, which did not seem necessary here)
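For comparison, the separate-output-thread alternative mentioned above could be sketched roughly as follows. This is only an illustration under assumptions: the class name `OutputWriter` and its `push` method are hypothetical and not part of the real application, and each worker is assumed to hand over a complete, already formatted block of text.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <ostream>
#include <queue>
#include <sstream>
#include <string>
#include <thread>
#include <utility>

// Hypothetical helper: workers push finished text, a single thread writes it out.
class OutputWriter {
public:
    explicit OutputWriter(std::ostream& out)
        : out_(out), thread_([this] { run(); }) {}

    // Marks the queue as finished, lets the writer drain it, then joins.
    // All producers must have stopped calling push() before this runs.
    ~OutputWriter() {
        {
            std::lock_guard<std::mutex> lk(mutex_);
            done_ = true;
        }
        cv_.notify_one();
        thread_.join();
    }

    // Called from worker threads; the string is moved into the queue.
    void push(std::string text) {
        {
            std::lock_guard<std::mutex> lk(mutex_);
            assert(!done_);
            queue_.push(std::move(text));
        }
        cv_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lk(mutex_);
        while (true) {
            cv_.wait(lk, [this] { return done_ || !queue_.empty(); });
            if (queue_.empty()) return;  // only reachable once done_ is set
            std::string text = std::move(queue_.front());
            queue_.pop();
            lk.unlock();  // write without holding up the producers
            out_ << text;
            lk.lock();
        }
    }

    std::ostream& out_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::string> queue_;
    bool done_ = false;
    std::thread thread_;  // declared last so all other members exist before run() starts
};
```

With this sketch, a worker would call something like `writer.push("print many results\n")` instead of taking the output mutex, and only the writer thread ever touches the stream.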
Update: Maybe my post was not clear enough. I do not care about the sequence of the results. The problem is that (for the example above) instead of this:
print many results
print many results
print many results
I sometimes get:
print many print many results
results
print many results
Note that outputMutex_ is a static member that is shared by all worker threads.
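A self-contained sketch of the setup described above might look like the following. The job queue is left out, and the function names `runWorkers` and `countUnexpectedLines`, the file-level `outputMutex`, and the worker and line counts are assumptions made for illustration, not code from the real application. The sketch is not guaranteed to reproduce the mixed output; it is only meant to pin down what is being locked and written.

```cpp
#include <cassert>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the static outputMutex_ member shared by all workers.
static std::mutex outputMutex;

// Starts numWorkers threads; each writes linesPerWorker lines to `out`,
// holding the shared mutex while writing each line.
void runWorkers(std::ostream& out, int numWorkers, int linesPerWorker) {
    assert(numWorkers >= 0 && linesPerWorker >= 0);
    std::vector<std::thread> workers;
    for (int w = 0; w < numWorkers; ++w) {
        workers.emplace_back([&out, linesPerWorker] {
            for (int i = 0; i < linesPerWorker; ++i) {
                std::lock_guard<std::mutex> lk(outputMutex);
                out << "print many results" << "\n";
            }
        });
    }
    for (auto& t : workers) t.join();
}

// Counts the lines in `text` that differ from the expected line.
int countUnexpectedLines(const std::string& text) {
    std::istringstream in(text);
    std::string line;
    int bad = 0;
    while (std::getline(in, line)) {
        if (line != "print many results") ++bad;
    }
    return bad;
}
```

To exercise the real stdout path, one could call `std::cin.tie(nullptr);` and `std::ios_base::sync_with_stdio(false);` at the start of `main`, then `runWorkers(std::cout, 8, 100000)`, and check the captured output line by line. Passing a `std::ostringstream` instead gives a baseline that takes stdout out of the picture.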