
I have a multi-threaded application that creates 48 threads, all of which need to access a common attribute (a std::map). The map is only written to when the threads start; the rest of the time it is only read from. This seems like the perfect use case for a pthread_rwlock_t, and everything appears to be working well.

I ran across a completely unrelated seg-fault and started analyzing the core. Using gdb, I executed the command `info threads` and was quite surprised by the results. I observed that several threads were reading from the map as expected, but the strange part is that several other threads were blocked in pthread_rwlock_rdlock(), waiting on the rwlock.

Here is the stack trace for a thread that is waiting on the lock:

#0  0xffffe430 in __kernel_vsyscall ()
#1  0xf76fe159 in __lll_lock_wait () from /lib/libpthread.so.0
#2  0xf76fab5d in pthread_rwlock_rdlock () from /lib/libpthread.so.0
#3  0x0804a81a in DiameterServiceSingleton::getDiameterService(void*) ()

With so many threads, it's difficult to say how many were reading and how many were blocked, but I don't understand why any threads would be blocked waiting to read, considering other threads are already reading.

So here is my question: why are some threads blocked waiting to read an rwlock when other threads are already reading from it? It appears as though there is a limit to the number of threads that can read simultaneously.

I've looked at the pthread_rwlockattr_t functions and didn't see anything relevant.

The OS is Linux, SUSE 11.

Here is the related code:

// One-time initialization (e.g., in the constructor)
{
  pthread_rwlock_init(&serviceMapRwLock_, NULL);
}

// This method is called for each request processed by the threads
Service *ServiceSingleton::getService(void *serviceId)
{
  pthread_rwlock_rdlock(&serviceMapRwLock_);
    ServiceMapType::const_iterator iter = serviceMap_.find(serviceId);
    bool notFound(iter == serviceMap_.end());
  pthread_rwlock_unlock(&serviceMapRwLock_);

  if(notFound)
  {
    return NULL;
  }

  return iter->second;
}

// This method is only called when the app is starting
void ServiceSingleton::addService(void *serviceId, Service *service)
{
  pthread_rwlock_wrlock(&serviceMapRwLock_);
    serviceMap_[serviceId] = service;
  pthread_rwlock_unlock(&serviceMapRwLock_);
}

Update:

As mentioned in the comments by Mark B, if I had set the lock kind with pthread_rwlockattr_setkind_np() to give priority to writers, and there were a writer blocked waiting, then the observed behavior would make sense. But I'm using the default value, which I believe gives priority to readers. I just verified that there are no threads blocked waiting to write. I also updated the code as suggested by @Shahbaz in the comments and got the same results.
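
For reference, giving writers priority would mean initializing the lock with an explicit attribute, something like the sketch below (which I'm not doing; the constant is the glibc-specific writer-preference kind):

pthread_rwlockattr_t attr;
pthread_rwlockattr_init(&attr);
// Block new readers while a writer is waiting (glibc extension, non-portable)
pthread_rwlockattr_setkind_np(&attr, PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP);
pthread_rwlock_init(&serviceMapRwLock_, &attr);
pthread_rwlockattr_destroy(&attr);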

Brady
  • Are you *sure* that there wasn't a writer lock blocking as well? – Mark B Aug 08 '12 at 14:20
  • @MarkB That's an excellent question! But doesn't that depend on pthread_rwlockattr_setkind_np(), which I haven't called? I'm not sure if any threads are waiting to write, but they shouldn't be, since that should ONLY happen at the beginning. I'll have to check though. – Brady Aug 08 '12 at 14:22
  • @MarkB, what effect would that have if a writer is waiting and I hadn't set pthread_rwlockattr_setkind_np()? As I understand it, the writer may be starved if there are continuous readers, right? – Brady Aug 08 '12 at 14:27
  • From the `pthread_rwlock_rdlock` man page: `The pthread_rwlock_rdlock() function applies a read lock to the read-write lock referenced by rwlock. The calling thread acquires the read lock if a writer does not hold the lock *and there are no writers blocked on the lock*.` (emphasis mine). – Mark B Aug 08 '12 at 14:30
  • @MarkB, ok thanks. I thought the behavior mentioned in the last part was only achieved with pthread_rwlockattr_setkind_np(). If not, then what does that setting do? – Brady Aug 08 '12 at 14:34
  • It looks like the behavior I quoted is only for Solaris. On Linux it does default to writer starvation unless you set the attribute to allow blocked writers to block read locks as well. In that case I have no idea what's happening. – Mark B Aug 08 '12 at 14:38
  • Side note: Is `serviceMap_.end()` guaranteed to be constant even if the map changes? (I never had to look something like that up from the standard). If not, then you have to put the `if` inside the critical section also. – Shahbaz Aug 08 '12 at 15:06
  • @Shahbaz, I thought about that when I was posting this question. That's a good question that I'll have to look into. Considering that the writes happen only at startup, and I haven't seen any strange behavior, I don't think it's a problem. – Brady Aug 08 '12 at 15:13
  • @Brady, well, since [end() is decrementable](http://stackoverflow.com/a/5322234/912144), I would say it could change if the map is changed. Better to be safe and place that check inside the critical section. Or, most probably, set a bool equal to that condition inside the critical section and use it on the outside. – Shahbaz Aug 08 '12 at 15:21
  • @Shahbaz, thanks, I already updated the code to set a bool inside the critical section and will test it soon. – Brady Aug 08 '12 at 15:22
  • @Shahbaz, I updated the code and tested again, and I get the same results. – Brady Aug 08 '12 at 16:01

2 Answers


You merely observed the inherent performance issues involved with acquiring locks. It takes some time, and you just happened to catch those threads in the middle of it. This is particularly true when the operation protected by the lock has a very short duration.

Edit: Reading the source, glibc uses lll_lock to protect critical sections within its own pthread library data structures. pthread_rwlock_rdlock checks several flags and increments counters, and it does that bookkeeping while holding the internal lock. Once that is done, the internal lock is released with lll_unlock.
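
To make that structure concrete, here is a toy reader-writer lock in the same spirit (this is an illustration, not glibc's code): the internal mutex plays the role of lll_lock, and a reader can momentarily block on it even when no writer holds the read-write lock, which is exactly what a thread caught in __lll_lock_wait() looks like.

#include <condition_variable>
#include <mutex>

// Toy illustration only: rdlock() does its bookkeeping under a short
// internal lock, just as glibc's pthread_rwlock_rdlock() does under
// lll_lock. Like the glibc default, it prefers readers.
class ToyRwLock {
    std::mutex internal_;            // plays the role of lll_lock
    std::condition_variable wakeup_; // plays the role of the futex wait/wake
    bool writer_ = false;            // true while a writer holds the lock
    unsigned readers_ = 0;           // number of readers holding the lock

public:
    void rdlock() {
        std::unique_lock<std::mutex> lk(internal_); // brief internal lock
        wakeup_.wait(lk, [this] { return !writer_; });
        ++readers_;                                 // bookkeeping while held
    }                                               // internal lock released here

    void rdunlock() {
        std::lock_guard<std::mutex> lk(internal_);
        if (--readers_ == 0) wakeup_.notify_all();  // let a waiting writer in
    }

    void wrlock() {
        std::unique_lock<std::mutex> lk(internal_);
        wakeup_.wait(lk, [this] { return !writer_ && readers_ == 0; });
        writer_ = true;
    }

    void wrunlock() {
        {
            std::lock_guard<std::mutex> lk(internal_);
            writer_ = false;
        }
        wakeup_.notify_all();                       // wake readers and writers
    }
};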

To demonstrate, I implemented a short routine in which each thread sleeps after acquiring the rwlock for reading. The main thread waits for them all to finish, but before doing so it prints the concurrency achieved by the threads.

#include <pthread.h>
#include <unistd.h>
#include <iostream>

enum { CONC = 50 };

pthread_rwlock_t rwlock;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
unsigned count;   // number of threads that have acquired the read lock

void *routine(void *arg)
{
    int *fds = static_cast<int *>(arg);
    pthread_rwlock_rdlock(&rwlock);
    pthread_mutex_lock(&mutex);
    ++count;
    if (count == CONC) pthread_cond_signal(&cond);
    pthread_mutex_unlock(&mutex);
    sleep(5);
    pthread_rwlock_unlock(&rwlock);
    pthread_t self = pthread_self();
    write(fds[1], &self, sizeof(self));
    return 0;
}

And the main thread waits for the counter to reach 50:

int main()
{
    int fds[2];
    pipe(fds);
    pthread_rwlock_init(&rwlock, 0);
    pthread_mutex_lock(&mutex);
    for (int i = 0; i < CONC; i++) {
        pthread_t tid;
        pthread_create(&tid, NULL, routine, fds);
    }
    while (count < CONC) pthread_cond_wait(&cond, &mutex);
    pthread_mutex_unlock(&mutex);
    std::cout << "count: " << count << std::endl;
    for (int i = 0; i < CONC; i++) {
        pthread_t tid;
        read(fds[0], &tid, sizeof(tid));
        pthread_join(tid, 0);
    }
    pthread_rwlock_destroy(&rwlock);
    pthread_exit(0);
}
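
If you compile this with g++ -pthread and run it, it prints count: 50, showing that all fifty threads were inside the read-locked section at the same time; the read lock itself imposes no limit on concurrent readers.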

Edit: Simplified the example using C++11 thread support:

#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

#include <pthread.h>
#include <unistd.h>

enum { CONC = 1000 };
std::vector<std::thread> threads;

pthread_rwlock_t rwlock;
std::mutex mutex;
std::condition_variable cond;

void *routine(int self)
{
    pthread_rwlock_rdlock(&rwlock);
    { std::unique_lock<std::mutex> lk(mutex);
      if (++count == CONC) cond.notify_one(); }
    sleep(5);
    pthread_rwlock_unlock(&rwlock);
    return 0;
}

int main()
{
    pthread_rwlock_init(&rwlock, 0);
    { std::unique_lock<std::mutex> lk(mutex);
      for (int i = 0; i < CONC; i++) {
          threads.push_back(std::thread(routine, i));
      }
      cond.wait(lk, [](){return count == CONC;}); }
    std::cout << "count: " << count << std::endl;
    for (int i = 0; i < CONC; i++) {
        threads[i].join();
    }
    pthread_rwlock_destroy(&rwlock);
    pthread_exit(0);
}
jxh
  • I updated my question with the stack trace of a thread waiting on a read lock. Do you mean that even though the thread is in `__lll_lock_wait()`, it's not actually waiting/blocked, but is inside the function acquiring a read lock? If so, that's an unfortunate function name; wouldn't be the first, though :) – Brady Aug 08 '12 at 15:42
  • @Brady Yes. I don't have the sources in front of me to confirm, but I suspect this corresponds to the system call entry point for `glibc` to get to the kernel. – jxh Aug 08 '12 at 15:57
  • This would make sense. Not that I don't believe you :) but I'm looking for the source code to verify it. I'll let you know when I find it, thanks! – Brady Aug 08 '12 at 16:28
  • @Brady: I scanned the source, and updated the answer with what I found. Regards – jxh Aug 08 '12 at 16:41
  • Great answer. Nice research. – Nemo Aug 08 '12 at 16:51

As a side note, the code posted above is broken. You can't access iter->second outside the rwlock'd section, because as soon as you release the rwlock, a writer can remove any element from the map, invalidating any iterator into it.

I know you're not doing this in your case, since you don't write anything past the beginning of the program's execution, but it's still worth mentioning.
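
For illustration, a variant of the getService() shown in the question that copies the mapped pointer while the read lock is still held would look roughly like this:

Service *ServiceSingleton::getService(void *serviceId)
{
  Service *service = NULL;

  pthread_rwlock_rdlock(&serviceMapRwLock_);
    ServiceMapType::const_iterator iter = serviceMap_.find(serviceId);
    if (iter != serviceMap_.end())
      service = iter->second;   // copy the value while the lock is held
  pthread_rwlock_unlock(&serviceMapRwLock_);

  return service;               // NULL if the id was not found
}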

Also, as a side note, since the behaviour you describe appears to be serialized (the writers fill the map at the beginning, and from then on the readers only read it), you should probably write it like this:

int writerDoneWithMap = 0;
// pthread_cond & mutex init here

// The writers fill the map here

// Then they atomically set the flag and signal the readers that they are done
__sync_bool_compare_and_swap(&writerDoneWithMap, 0, 1);
// pthread_cond_broadcast here


// The readers then simply do this (with the mutex held around the wait):
while (!writerDoneWithMap)
{
     // pthread_cond_wait here
}
// Read the data without locks.

The code above avoids any locking in the readers once the writers have finished filling the map; until then, you fall back to the usual pthread_cond/mutex technique. It is correct if and only if the writers run strictly before the readers (which is what you said); otherwise it will fail.
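
To make that concrete, a complete sketch of the handoff could look like the following, under the same assumption that all writes finish before any reads start (the names mapReadyMutex, mapReadyCond, publishMap and waitForMap are made up for the example):

#include <pthread.h>
#include <map>

class Service;                              // type from the question
std::map<void *, Service *> serviceMap_;    // filled by the writers at startup

static pthread_mutex_t mapReadyMutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  mapReadyCond  = PTHREAD_COND_INITIALIZER;
static bool            mapReady      = false;

// Called once, after the writers have finished filling serviceMap_.
void publishMap()
{
    pthread_mutex_lock(&mapReadyMutex);
    mapReady = true;                        // the flag is protected by the mutex,
    pthread_cond_broadcast(&mapReadyCond);  // so no atomics are needed here
    pthread_mutex_unlock(&mapReadyMutex);
}

// Each reader calls this once before its first lookup; after it returns,
// the reader can access serviceMap_ with no locking at all, because the
// map never changes again.
void waitForMap()
{
    pthread_mutex_lock(&mapReadyMutex);
    while (!mapReady)
        pthread_cond_wait(&mapReadyCond, &mapReadyMutex);
    pthread_mutex_unlock(&mapReadyMutex);
}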

xryl669