I want to avoid a race condition in parallel code. My class contains several global variables (for simplicity, say just one, x) as well as a for loop that I want to parallelize. The actual code also has a method that takes a pointer to a class, in this case itself, as its argument and accesses even more global variables. So it might make sense to make the entire instance threadprivate.
I am using OpenMP.
A minimal working example is:
#include <iostream>
#include <omp.h>

class lotswork {
public:
    int x;
    int f[10];
    lotswork(int i = 0) { x = i; };
    void addInt(int y) { x = x + y; }
    void carryout() {
        #pragma omp parallel for
        for (int n = 0; n < 10; ++n) {
            this->addInt(n);
            f[n] = x;
        }
        for (int j = 0; j < 10; ++j) {
            std::cout << " array at " << j << " = " << f[j] << std::endl;
        }
        std::cout << "End result = " << x << std::endl;
    }
};

int main() {
    lotswork production(0);
    #pragma omp threadprivate(production)
    production.carryout();
}
My question is: how can I do this? Using the threadprivate directive produces the following compiler error:
error: ‘production’ declared ‘threadprivate’ after first use
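For context, OpenMP only accepts threadprivate on variables with static storage duration (file scope, namespace scope, static block scope, or static class members), and the directive has to appear before the variable's first use, so a local object in main cannot be listed. Below is a minimal sketch of a placement that is accepted. It assumes a plain aggregate without a user-defined constructor, to stay clear of the non-POD limitations; the names `counters`, `tls_state` and `run_threadprivate` are hypothetical and not part of my code.

```cpp
#include <vector>

// Hypothetical plain aggregate: no user-defined constructor.
struct counters {
    int x;
};

// File-scope variable with static storage duration. The directive follows
// the declaration directly, before anything reads or writes the variable.
static counters tls_state = {0};
#pragma omp threadprivate(tls_state)

// Each iteration touches only the calling thread's copy of tls_state and
// writes to its own slot of `out`, so no two threads share a location.
std::vector<int> run_threadprivate(int count) {
    std::vector<int> out(count);
    #pragma omp parallel for
    for (int n = 0; n < count; ++n) {
        tls_state.x = n;       // overwrite this thread's private copy
        tls_state.x += 1;      // modify it without affecting other threads
        out[n] = tls_state.x;  // out[n] depends only on n
    }
    return out;
}
```

Because every iteration first overwrites its thread's copy, the contents of `out` do not depend on how many threads run the loop. This does not yet cover a class with a constructor like `lotswork`.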
I think the compiler issue described here still hasn't been resolved:
This brings us to why I used the Intel compiler. Visual Studio 2013 as well as g++ (4.6.2 on my computer, Coliru (g++ v5.2), codingground (g++ v4.9.2)) allow only POD types (source). This is listed as a bug for almost a decade and still hasn't been fully addressed. The Visual Studio error given is: error C3057: 'globalClass' : dynamic initialization of 'threadprivate' symbols is not currently supported. The error given by g++ is: error: 'globalClass' declared 'threadprivate' after first use. The Intel compiler works with classes.
Unfortunately, I don't have access to Intel's compiler and use GCC 8.1.0 instead. I did some background research and found a discussion of this here, but that trail went cold ten years ago. I am asking this question because several people have run into this issue and solved it either by declaring a class pointer, as here, or by proposing terrible workarounds. The latter approach seems misguided because a pointer is usually declared as a constant, and we would then have threadprivate pointers while the instance itself is still shared.
Attempt at solution
I believe I can use the private clause but am unsure how to apply it to an entire instance of a class, although I'd prefer the threadprivate directive. An example similar to mine above, on which I modeled my MWE, is discussed in Chapter 7, Figure 7.17 of this book, but without a solution. (I am well aware of the race condition and why it's a problem.)
If necessary I can give evidence that the output of the above programme without any extra keywords is nondeterministic.
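To make the idea of a private instance concrete, here is a minimal sketch of how the firstprivate clause can be applied to a whole object. It assumes that a named copy of the instance is acceptable, since data-sharing clauses take variable names rather than expressions. The names `lotswork_fp` and `scratch` are hypothetical.

```cpp
// Hypothetical variant of the MWE class; the implicit copy constructor is
// what firstprivate uses to create each thread's copy.
class lotswork_fp {
public:
    int x;
    int f[10];
    explicit lotswork_fp(int i = 0) : x(i), f() {}
    void addInt(int y) { x += y; }

    void carryout() {
        // Named copy of the current instance, so it can be listed in a clause.
        lotswork_fp scratch(*this);

        // Every thread receives its own copy-constructed `scratch`.
        #pragma omp parallel for firstprivate(scratch)
        for (int n = 0; n < 10; ++n) {
            scratch.addInt(n);  // modifies the thread's private copy only
            f[n] = scratch.x;   // distinct index per iteration, so no race
        }
        // The private copies are discarded here; this->x is unchanged.
    }
};
```

This removes the data race on x, but each thread accumulates only the iterations it was given, so the values stored in f still depend on how the iterations are distributed, and x itself is never updated. It shows the mechanism, not a drop-in replacement for the sequential result.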
Another attempt at solution
I have now thought of a solution but for some reason, it won't compile. From a thread-safety and logical standpoint my problem should be solved by the following code. Yet, there must be some sort of error.
#include <iostream>
#include <omp.h>

class lotswork : public baseclass {
public:
    int x;
    int f[10];
    lotswork(int i = 0) { x = i; };
    void addInt(int y) { x = x + y; }
    void carryout() {
        // idea is to declare the instance private
        #pragma omp parallel firstprivate(*this){
            // here, another instance of the base class will be instantiated, which is
            // inside the parallel region and hence automatically private
            baseclass<lotswork> solver;
            #pragma omp for
            for (int n = 0; n < 10; ++n)
            {
                this->addInt(n);
                f[n] = x;
                solver.minimize(*this, someothervariablethatisprivate);
            }
        } // closing the pragma omp parallel region
        for (int j = 0; j < 10; ++j) {
            std::cout << " array at " << j << " = " << f[j] << std::endl;
        }
        std::cout << "End result = " << x << std::endl;
    }
};

int main() {
    lotswork production(0);
    #pragma omp threadprivate(production)
    production.carryout();
}
So this code, based on the definitions, should do the trick, but somehow it doesn't compile. How can I put this code together so that it compiles and achieves the desired thread safety, given that threadprivate is not an option for those of us without the Intel compiler?
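Two details of the second attempt are rejected regardless of compiler: list items in a clause must be variable names, so `*this` is not accepted, and the opening brace of the parallel region has to start on the line after the pragma. Below is a minimal sketch of the same structure with both points changed. It assumes that only the end result is needed, and it combines the per-thread partial results with a reduction on a local variable. The names `lotswork_sum`, `local` and `total` are hypothetical, and the solver call is left out because its types are not shown here.

```cpp
// Hypothetical variant of the MWE class that only computes the end result.
class lotswork_sum {
public:
    int x;
    explicit lotswork_sum(int i = 0) : x(i) {}
    void addInt(int y) { x += y; }

    void carryout() {
        int total = x;  // local variable, so it may appear in a clause

        // Brace on its own line; each thread gets a private `total`
        // initialised to 0 by the reduction.
        #pragma omp parallel reduction(+:total)
        {
            // Declared inside the region, hence automatically private.
            lotswork_sum local(0);

            #pragma omp for
            for (int n = 0; n < 10; ++n) {
                local.addInt(n);  // accumulate this thread's share
            }

            total += local.x;  // contribute the partial sum
        }  // partial sums are added to the original value of total here

        x = total;
    }
};
```

Starting from `lotswork_sum w(0)`, a call to `w.carryout()` leaves `w.x` at the sum of 0 through 9 for any number of threads. The running values stored in f are not reproduced, since each one depends on all earlier iterations.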