2

I have a program which uses a lot of std::map structures. Now I want to use them with with multiple threads and assume that inserting or deleting keys will potentially alter the whole data structure and break it in parallel. But when I do not add new keys, it should be fine, right?

The following program shows what I want to do:

#include <omp.h>                                                                

#include <iostream>                                                             
#include <map>                                                                  

int main(int const argc, char const *const *const argv) {                          
  // Take a map and allocate the elements, but not fill them at this point.        
  std::map<int, int> map;                                                          
  int size = 10000;                                                                
  for (int i = 0; i < size; ++i) {                                                 
    map[i];                                                                        
  }                                                                                

  // Go through the elements in parallel and write to them, but not create any  
  // new elements. Therefore there should not be any allocations and it should  
  // be thread-safe.                                                               
#pragma omp parallel                                                               
  {                                                                                
    int const me = omp_get_thread_num();                                           
#pragma omp for                                                                    
    for (int i = 0; i < size; ++i) {                                               
      map[i] = me;                                                                 
    }                                                                              
  }                                                                                

  // Now all threads access all the elements of the map, but as the map is not  
  // changed any more, nothing bad should happen.                               
#pragma omp parallel                                                               
  {                                                                                
    int const me = omp_get_thread_num();                                           
    int self = 0;                                                                  

    for (int i = 0; i < size; ++i) {                                               
      if (map[i] == me) {                                                          
        ++self;                                                                    
      }                                                                            
    }                                                                              

#pragma omp critical(cout)                                                         
    std::cout << "Thread " << me << " found " << self << " entries.\n";            
  }                                                                                
}

Then I compile it with the following:

$ g++ -fopenmp -O3 -Wall -Wpedantic -g -fsanitize=address -o concurrent-map concurrent-map.cpp

This seems to work just fine with four threads. If I comment out the first for loop and let the threads populate the map, it crashes with segmentation faults, as I expect.

Of course I cannot prove that std::map is thread-safe in the way I think this way, but it at least does not prove the negative. Can I use the std::map in this fashion in parallel?

Martin Ueding
  • 8,245
  • 6
  • 46
  • 92
  • If performance is important, you almost always would want to use `std::unordered_map`. – nada Nov 04 '19 at 11:59
  • 5
    This may be helpful : https://stackoverflow.com/questions/15067160/stdmap-thread-safety. I feel that what you are doing is not thread safe. – coincoin Nov 04 '19 at 12:01
  • 2
    `operator[]` is not `const` for `std::map`, so implementation is allowed to be non-threadsafe. Better alternative is to use `map.find(i)->second`. – smitsyn Nov 04 '19 at 13:23

2 Answers2

3

I don't think using map[i] specifically is thread safe for all C++ implementations, even if it is not inserting a new element. The standard does not require operator[] to be data race free for associative containers:

Section [container.requirement.dataraces]/1 of the C++17 standard draft contains a list of functions that should not cause data races, even though they are not const. The list includes find and at, but not operator[].

Therefore you need to use find or at instead of operator[]. A particular implementation may give stronger guarantees and that may be likely if map[i] is not inserting a new element, but you would need to check that with your compiler/standard library documentation.

Aside from that, access, even modifying, to different elements of a container is always fine (except for vector<bool>), see the next paragraph in the standard.

walnut
  • 21,629
  • 4
  • 23
  • 59
  • Since this code is running on various machines (amd64, ppc) with various compilers (GCC, LLVM, Intel C++), it likely is better to use `at` instead of `operator[]`. The additional range checking should not be that expensive and actually help to fortify the code against dumb mistakes. – Martin Ueding Nov 04 '19 at 14:39
  • @MartinUeding I don't know how openmp handles exceptions uncaught in a parallel block, so you might want to read up on that first. My guess is that it is simply not allowed leading to a call to std::terminate or UB. I suppose you need to add a `catch` inside the parallel block to be caught by the throwing thread. – walnut Nov 04 '19 at 16:37
2

EDIT: There is no standard guarantee it will work properly since operator[] is not guaranteed to not modify the structure. Instead at() or find() would be better choices.

As far as I understand C++ standard and OpenMP docs - this is safe. Firstly, as long as you don't make operations modifying iterators the parallel modification should be fine.

Second question is if the data written in different threads will be visible in other threads. Luckily OpenMP has pretty good documentation which states that memory sync happens implicitly:

At exit from the task region of each implicit task;

bartop
  • 9,971
  • 1
  • 23
  • 54