10

I want to avoid a race condition in parallel code. The issue is that my class contains several global variables, let's say just one x for simplicity as well as a for loop that I wish to make parallel. The actual code also has a method that takes a pointer to a class, in this case itself, as its argument, accessing even more global variables. So it might make sense to make the entire instance threadprivate. I am using OpenMP.

A minimum working example is:

#include <iostream>
#include <omp.h>
class lotswork {
public:
    int x;
    int f[10];

    lotswork(int i = 0) { x = i; };

    void addInt(int y) { x = x + y; }

    void carryout(){

        #pragma omp parallel for
        for (int n = 0; n < 10; ++n) {
            this->addInt(n);
            f[n] = x;
        }
        for(int j=0;j<10;++j){
            std::cout << " array at " << j << " = " << f[j] << std::endl;
        }
        std::cout << "End result = " << x << std::endl;
    }
};



int main() {
    lotswork production(0);
    #pragma omp threadprivate(production)
    production.carryout();

}

My question is, how can I do this? Using the keyword threadprivate returns the following compiler error message: error: ‘production’ declared ‘threadprivate’ after first use I think this compiler issue here still hasn't been solved:

This brings us to why I used the Intel compiler. Visual Studio 2013 as well as g++ (4.6.2 on my computer, Coliru (g++ v5.2), codingground (g++ v4.9.2)) allow only POD types (source). This is listed as a bug for almost a decade and still hasn't been fully addressed. The Visual Studio error given is error C3057: 'globalClass' : dynamic initialization of 'threadprivate' symbols is not currently supported and the error given by g++ is error: 'globalClass' declared 'threadprivate' after first use The Intel compiler works with classes.

Unfortunately, I haven't got access to Intel's compiler but use GCC 8.1.0. I did a bit of background research and found a discussion on this here, but that trail runs cold, ten years ago. I am asking this question because several people have had issues with this and solved it either by declaring a class pointer as here or proposing terrible workarounds. The latter approach seems misguided because a pointer is usually declared as a constant but then we have threadprivate pointers while the instance is still shared.

Attempt at solution

I believe I can use the private keyword but am unsure how to do this with an entire instance of a class although I'd prefer the threadprivate keyword. A similar example to mine above on which I modeled my MWE has also been discussed in Chapter 7, Figure 7.17 in this book, but without solution. (I am well aware about the race condition and why it's a problem.)

If necessary I can give evidence that the output of the above programme without any extra keywords is nondeterministic.

Another attempt at solution

I have now thought of a solution but for some reason, it won't compile. From a thread-safety and logical standpoint my problem should be solved by the following code. Yet, there must be some sort of error.

#include <iostream>
#include <omp.h>
class lotswork : public baseclass {
public:
    int x;
    int f[10];

    lotswork(int i = 0) { x = i; };

    void addInt(int y) { x = x + y; }
    
        void carryout(){
    //idea is to declare the instance private
    #pragma omp parallel firstprivate(*this){
    //here, another instance of the base class will be instantiated which is inside the parallel region and hence automatically private
    baseclass<lotswork> solver;

  #pragma omp for
  for (int n = 0; n < 10; ++n) 
      {
          this->addInt(n);
          f[n] = x;
          solver.minimize(*this,someothervariablethatisprivate);
      }
                                             } //closing the pragma omp parallel region
                for(int j=0;j<10;++j){
                    std::cout << " array at " << j << " = " << f[j] << std::endl;
                }
                std::cout << "End result = " << x << std::endl;
            }
        };
        
        
        
    int main() {
        lotswork production(0);
        #pragma omp threadprivate(production)
        production.carryout();
    
    }

So this code, based on the definitions, should do the trick but somehow it doesn't compile. How can I put this code together so it achieves the desired thread-safety and compiles, respecting the constraint that threadprivate is not an option for non-Intel compiler folks?

halfer
  • 19,824
  • 17
  • 99
  • 186
Hirek
  • 435
  • 3
  • 12
  • I put a fair amount of research into this question and it seems to concern a lot of people. Rather than anonymously, and unhelpfully, downvote, why not provide heads-up on what should be improved? – Hirek Jan 02 '19 at 23:31
  • 4
    Not the downvoter, but it would be helpful if you included a [MCVE], enough to make what you believe is a self-contained, correct piece of code that *should* be compilable but isn't. – ShadowRanger Jan 03 '19 at 00:08
  • Fair does, I changed the question slightly. – Hirek Jan 03 '19 at 14:16
  • Please give some [MCVE] in your question. What actual C++ code have you compiled, and with what exact compilation command? – Basile Starynkevitch Jan 03 '19 at 14:17
  • will do in a moment though the second link has one, too. – Hirek Jan 03 '19 at 14:20
  • @BasileStarynkevitch I have added the MWE. About the compiler, I use the -fopenmp flag and updated to the latest version of gcc. However, I use clion and cmake, so they're all a bit lengthy. The program compiles fine without the threadprivate directive. – Hirek Jan 03 '19 at 16:42
  • It would already be helpful if someone with an Intel compiler could compile this and see if it works, but I think I need some help on how to handle the instance as a private variable. – Hirek Jan 03 '19 at 17:28
  • 2
    Could you explain why you want to preserve private variables between parallel regions? I usually can find a workaround for this and I almost never need thread private. Explain what you want to do. – Z boson Jan 04 '19 at 08:38
  • You're right, I don't need preservation between regions because there is only one. I'm simply looking to make the instance 'production' private rather than selected public variables because my class inherits from an optimiser class which has a call bfgsSolver(*this, param) which in turn calls a method of my class that uses public variables I know and could declare private manually. However given there is another invocation of the current instance via *this it is likely to use more public variables particular to the optimiser class. To be safe I therefore wish to make the whole instance private – Hirek Jan 04 '19 at 15:21
  • @Z boson further there is only one parallel for loop so yes I definitely don't need preservation but so far struggled with the private/first private constructs where entire instances are concerned. Threadprivate seemed easier were it not for gcc's lack of dynamic support. So if you could help me how to use the threadprivate functionality id much grateful. This question / answer would also then be a good workaround to point others to. – Hirek Jan 04 '19 at 15:24
  • Im sorry I meant to say 'to use the firstprivate functionality' as it pertains to my instance as a workaround for threadprivate. – Hirek Jan 05 '19 at 00:41
  • Hm maybe to be clearer, I would like to have a parallel for loop within a method of my class that uses public/global variables. Moreover, my class inherits from a class https://github.com/PatWie/CppNumericalSolvers/blob/master/README.md from which I use a method that invokes the current instance via *this. I am unsure about the thread-safety but know that thanks to my public variables, my method is not thread-safe for sure. Hence, Id like to make my whole instance private. @Z boson – Hirek Jan 05 '19 at 00:50
  • 1
    Are you sure you don't mix-up the `private` from C++ which corresponds to "visibility" within the class hierarchy and the `private` from OpenMP which corresponds to be replicated on a per-thread basis? Both are completely orthogonal concepts. And so far, I really don't understand what's your problem and what you're trying to achieve... – Gilles Jan 05 '19 at 14:27
  • I am pretty sure, yes. I could declare the public variables private to prevent other classes seeing them but the issue here is that my method 'carryout' once operating on two or more concurrent threads will start reading from and writing to global variables. In the real implementation that I have, my class encapsulates an optimisation problem that computes starting values and an objective function value along with several intermediate results which I would like to repeat 100k or more times, in parallel. Each repetition or for loop execution would involve read/write access. – Hirek Jan 05 '19 at 14:41
  • to several variables that would then be treated as 'shared' by default. So while starting values are written for thread a, thread b would get around to doing the same, overwrite them. Meanwhile thread a picks up the overwritten starting values instead of the proper ones... etc. and if this was one or two vars, I would say fine I will write it out, but there are about 30 plus those that come from the optimiser class itself which I don't even now (it's thousands of lines of code). – Hirek Jan 05 '19 at 14:48
  • Hence the need to declare the instance private, which if gcc did not have that lacking functionality would be a charm using threadprivate. As things stand, I am stuck with just 'private' and don't know how to apply this to an instance when declaring private happens within one of the methods. I guess I could do it in 'main' but using something like pragma omp single firstprivate(myinstance) and then include the omp pragma parallel for where applicable and be done? Anyhow, thanks for listening and any support. @Gilles – Hirek Jan 05 '19 at 14:49
  • A class cannot "contain global variables"; that literally makes no sense. I agree with Gilles - it's really hard to see what you're trying to ask here, and what problem you perceive that needs solving, sorry. – Lightness Races in Orbit Jan 12 '19 at 18:09

2 Answers2

4

This is a long-standing missing GCC feature:

With current GCC versions, thread_local is expected to work, though:

int main() {
  thread_local lotswork production(0);
  production.carryout();
}

However, I do not think this will work in your case because the parallel loop in carryout will still operate on a single lotswork instance. I believe this would apply to the original code using threadprivate, too. You probably need to move the parallel loop outside of the carryout member function.

Florian Weimer
  • 32,022
  • 3
  • 48
  • 92
  • Thanks for pointing this out. I might just bite the bullet and declare a laundry list of variables private for my one measly region. – Hirek Jan 06 '19 at 18:37
4

It seems like there is some confusion about OpenMP constructs here. threadprivate is used, much like thread_local, to create a per-thread copy of an object of static lifetime, either a global or a static variable. As noted, there are some implementation issues with this, but even if the implementations could handle the class, using threadprivate on a non-static local variable would produce an error.

As to the error, it's hard to say without output, but it is likely multiple things:

  1. The unmatched closing brace. Placing a { on the end of a pragma line does not open a block, it needs to be on the following line.
  2. It is not valid to privatize an enclosing class instance that way

If you need to create a private copy of the enclosing class in each thread, it's possible by either copy-constructing the class into a variable declared inside a parallel region:

#pragma omp parallel
{
  lotswork tmp(*this);
  // do things with private version
}

Note however that the entire thing is private, so this means that f in the original copy will not be updated unless you perform the addInt equivalents all on the private copies then the f[n] assignments on the original.

Edit: I originally mentioned using the default(firstprivate) clause, but the default clause only offers private and first private for FORTRAN. To get the same effect in c++, do the above and copy construct into a new instance of each, or use a lambda with capture by value by default then firstprivate that, *this requires c++17 to work, but does exactly what's requested:

auto fn = [=,*this](){
  // do things with private copies
  // all updates to persist in shared state through pointers
};
#pragma omp parallel firstprivate(fn)
fn();
Tom Scogland
  • 937
  • 5
  • 12
  • 'default(firstprivate)' won't work or at least CLion complains saying it's expecting 'none' or 'shared.' I shall read the OpenMP doc to see about your suggestion that "it is possible to make private or firstprivate the default" – Hirek Jan 11 '19 at 13:56
  • soo I checked and what you say is true in general only for Fortran, see here https://computing.llnl.gov/tutorials/openMP/#DEFAULT although there is a mysterious note that "[h]owever, actual implementations may provide this option." – Hirek Jan 11 '19 at 13:59
  • Yup, won't compile and gives the error "expected 'non' or 'shared' before 'firstprivate'." I guess the good folks of LLNL are referring to custom-built versions of OpenMP? – Hirek Jan 11 '19 at 14:02
  • You’re quite right. Even those of us who help write the thing make these mistakes sometimes. Answer edited. – Tom Scogland Jan 12 '19 at 04:50