0

I have a function memoize that reads and writes a static std::map, eg:

int memoize(int i)
{
static std::map<int,int> map;
const auto iterator = memoizer.find(i);
if (iterator == memoizer.end())
  memoizer[i]=i+1;
else
  return iterator->second;
}

memoize is called by other functions which are called by other functions which are called.... by functions in main. Now if in main I have something like

#pragma omp parallel for
for (int i=0; i<1000; i+++)
  f(i); \\function calling memoize

for (int i=0; i<1000; i+++)
  g(i); \\function calling memoize

then I have a problem in the first loop since std::map is not thread safe. I am trying to figure out a way to lock the static map only if openmp is used (thus only for the first loop). I would rather avoid having to rewrite all functions to take an extra omp_lock_t argument.

What's the best way to achieve that? Hopefully with the least amount of macros possible.

user12005284
  • 17
  • 1
  • 5
  • Consider having two separate functions: one thread-safe and one not. – freakish Jan 05 '20 at 17:23
  • @freakish Is there a thread-safe alternative to ```std::map```? And how do I detect if I'm in an openmp region from within? – user12005284 Jan 05 '20 at 17:26
  • What I propose is to not detect anything. You write two functions and it is up to the caller to know which one to use. The simplest thread safety in your case would be by locking a static [std::mutex](https://en.cppreference.com/w/cpp/thread/mutex). There are more efficient concurrent maps though, you'd have to google it. – freakish Jan 05 '20 at 17:27
  • @freakish The caller doesn't call these functions directly, but other functions calling functions.... calling the two alternatives. Hence the "nested" in the title – user12005284 Jan 05 '20 at 17:29
  • AFAIK functions are compiled only once. And what you want is a variant depending on where we are during compilation. I don't think this is possible, I might be wrong though. In that situation the only sensible way is to be thread-safe all the time. With a proper algorithm (see some interesting answers [here](https://stackoverflow.com/questions/14338732/is-it-possible-to-implement-lock-free-map-in-c)) you probably won't feel it. – freakish Jan 05 '20 at 17:34
  • 1
    If the memoization is worth the time that synchronization with multiple threads would cost, then it is probably also fine if a single thread locks a mutex (even if not needed). – walnut Jan 05 '20 at 17:36
  • Otherwise I would suggest just making the `memoization` thread local and merge the thread local maps together occasionally to reduce the synchronization cost. Depends on what exactly you are doing though. It is not clear to me from your code that `memoize` actually does memoization. – walnut Jan 05 '20 at 17:42
  • @walnut The example above just used a trivial ```memoize```, in reality the ```memoize[i]=i+1;``` calls the function that we are memoizing. – user12005284 Jan 05 '20 at 17:50
  • @user12005284 ok, then you need to make sure that you unlock the mutex before executing the function that is supposed to be memoized. – walnut Jan 05 '20 at 17:59

1 Answers1

3

A very simple solution to your problem would be to protect both the read and the update parts of your code with a critical OpenMP directive. Of course, to reduce the risk of unwanted interaction/collision with some other critical you might already have somewhere else in your code, you should give it a name to identify it clearly. And should your OpenMP implementation permit it (ie the version of the standard being high enough), and also if you have the corresponding knowledge, you can add a hint on how much of contention you expect from this update among the threads for example.

Then, I would suggest you to check whether this simple implementation gives you sufficient performance, meaning whether the impact of the critical directive isn't too much performance-wise. If you're happy with the performance, then just leave it as is. If not, then come back with a more precise question and we'll see what can be done.

For the simple solution, the code could look like this (with here a hint suggesting that high contention is expected, which is only a example for your sake, not a true expectation of mine):

int memoize( int i ) {
    static std::map<int,int> memoizer;
    bool found = false;
    int value;
    #pragma omp critical( memoize ) hint( omp_sync_hint_contended )
    {
        const auto it = memoizer.find( i );
        if ( it != memoizer.end() ) {
           value = it->second;
           found = true;
        }
    }
    if ( !found ) {
        // here, we didn't find i in memoizer, so we compute it
        value = compute_actual_value_for_memoizer( i );
        #pragma omp critical( memoize ) hint( omp_sync_hint_contended )
        memoizer[i] = value;
    }
    return value;
}
Gilles
  • 9,269
  • 4
  • 34
  • 53
  • Don't you still have a data race, since one thread could be calling `find` while the other is inserting? Also OP indicated that `i+1` represents the actual time-intensive task, so I guess that should not be part of the critical section. It may be obvious how to resolve this, but I guess it would make a clearer answer if you showed how that can be done. – walnut Jan 06 '20 at 08:30
  • Indeed there is a race condition between the reads and the writes in the map. ATM, the reading part needs to be protected as well as the writing part might modify it while accessed for reading from another thread. However, even with the reads protected, there will still be some races between threads to be the first to create the ith entry. Is that a true issue, well hard to tell... I'll fix the code – Gilles Jan 06 '20 at 08:50
  • I am not concerned about the race condition that might waste time doing calculations multiple times. I am concerned about undefined behavior. At least from the C++ standard point-of-view calling `find` and `operator[]` unsynchronized is a data race and causes undefined behavior. I don't know enough about OpenMP to know whether it puts any other requirements. – walnut Jan 06 '20 at 08:57
  • likewise. that is why I didn't even try to avoid the possible multiple computations and updates of the map for a given value as they will soon disappear. However, now that the reads are protected as well, the impact on performance might become quite high... To be seen by the OP. If needed, I can think of a 2-level solution with atomics which would be less costly – Gilles Jan 06 '20 at 09:02
  • At second thought, I realized that the access to `iterator->second()` and even `memoizer.end()` needed to be protected as well. So I rewrote the code in a way I think is now bulletproof. Whether it is effective is another question though. – Gilles Jan 06 '20 at 09:36
  • Are return statements inside a critical section not allowed? It would allow eliminating `found`. – walnut Jan 06 '20 at 09:39
  • unfortunately, it isn't. – Gilles Jan 06 '20 at 09:41
  • Thanks for your answer. I tried both critical and locking and unfortunately performance was not good enough, to the point where not using the memoizer was the better solution. The function being memoized is quite complicated, hence why I did not want to update the Question. But thanks anyway! – user12005284 Jan 08 '20 at 20:08