2

I want to dump the keys of an unordered_map while being able to simultaneously add and erase elements. It takes 4 seconds to dump entirely, it's too long. Is it possible to dump in separate thread, like this:

while (1) {
    pthread_mutex_lock( &mutex ); 
    if(iter!=map.end()){
        x=iter->first
        iter++;    
    }
    pthread_mutex_unlock( &mutex );

    do_this(x);  // this takes time to complete
}

while in the main thread I have:

pthread_mutex_lock( &mutex ); 
map.erase(iter);

Does the erase method of unordered map make problem, since the iterator will be invalid after erasing.

Is there any other safe way to dump in parallel?

Yasser
  • 376
  • 5
  • 13

3 Answers3

3

For unordered_map (and associative containers in general), the erase() member function does not invalidate iterators and references to other elements than the removed one(s).

However, here you may be erasing an element and invalidate iterators to it while your loop holds an iterator to that element: for instance, if you happen erase the element which is referenced by the next iterator that is going to be dereferenced in your loop.

Hence, you need to take care that the element you are removing is not referenced by the iterator you are going to process in the next loop of your while cycle:

pthread_mutex_lock( &mutex ); 
if (i != iter)
{
    map.erase(i);
}
else
{
    // Maybe store in a queue of elements to be removed after the loop is done
}

Where iter is the iterator variable used in the loop.

Andy Prowl
  • 124,023
  • 23
  • 387
  • 451
  • What if iter is erased by the other thread during the unsynced do_this() call - it will be invalidated, and the next increment will cause UB. – jmetcalfe Feb 03 '13 at 14:43
  • @jmetcalfe: I assumed from the question's text that `do_this()` processes the element but does not modify the map. Now I'm a bit confused though, as to *where* the `do_this()` is supposed to be – Andy Prowl Feb 03 '13 at 14:46
  • I don't think do_this() matters, just the fact that you are holding an iterator while not holding the mutex, so the other thread can come along and erase that iter. – jmetcalfe Feb 03 '13 at 14:51
  • @jmetcalfe: Well, `do_this()` is not holding an iterator, but an element. My assumption is that `do_this()` does not work with the map, so it won't hold iterators. – Andy Prowl Feb 03 '13 at 14:53
  • The loop around do_this() is still holding the iterator, and incrementing it in the next iteration – jmetcalfe Feb 03 '13 at 14:55
  • @jmetcalfe: `do_this()` does not hold an iterator, and the loop before it just loops through the end. Which is what makes me wonder as to where `do_this()` is actually *meant* to be. The loop can't interleave with `erase()` because of the mutex, only `do_this()` can. But then again, `do_this()` does not hold an iterator. – Andy Prowl Feb 03 '13 at 14:57
  • do_this doesn't hold an iterator, but the loop around it does. It holds it while do_this is called. if the iterator is invalidated at this point, do_this will succeed without issue, but in the next loop iteration the held iterator will be invalid. The loop can interleave because do_this is within the loop – jmetcalfe Feb 03 '13 at 15:01
  • @jmetcalfe: I don't know why, but for some reason I was reading the `if` as a `while`. Let me re-read it. – Andy Prowl Feb 03 '13 at 15:09
  • It's a bit confusing (and loops infinitely in it's current form...) :) – jmetcalfe Feb 03 '13 at 15:10
  • @jmetcalfe: right, the iterator could be invalidated. I will edit my answer – Andy Prowl Feb 03 '13 at 15:16
  • +1 for the new solution. Good idea. Only thing to add is you need to make sure that type of x (and the key) is simple i.e. not a reference, pointer or similar as that is obviously also invalidated. – jmetcalfe Feb 03 '13 at 15:28
1

See: What happens if you call erase() on a map element while iterating from begin to end?

Since you increment the iterator before calling the do_this method which calls erase it will not cause any troubles.

Just a thought: With your current algorithm I don't think that you need the mutex at all.

Community
  • 1
  • 1
bikeshedder
  • 7,337
  • 1
  • 23
  • 29
  • 1
    My understanding was that it was a separate thread calling erase, not the do_this call, which does the dumping? – jmetcalfe Feb 03 '13 at 14:43
1

You could get some (but not all - this really just allows you to interleave the operations so that the erase does not have to wait the entire 4 secs) of the desired parellelism by iterating buckets rather than iterating elements. This would be safe as long as the bucket count is not reduced

i.e.

pthread_mutex_lock( &mutex ); 
size_t count = map.bucket_count();
pthread_mutex_unlock( &mutex );

for(size_t i = 0; i<count; ++i){
  pthread_mutex_lock( &mutex );
  for(auto it = map.begin(i); it != map.end(i); ++i)
    do_this(it->first);
  pthread_mutex_unlock( &mutex );
}

if you wanted to pull the do_this out of the mutex you would need to accumulate the values in some other structure

Another suggestion, depending on exactly how this map is used elsewhere, is that you could just swap the element to some known invalid value instead of erasing, then have the thread which is doing the dumping/do_this do the actual erase when it sees this value.

jmetcalfe
  • 1,296
  • 9
  • 17