I have a multi-threaded application that creates 48 threads, all of which need to access a common attribute (a std::map). The map is only written to when the threads start; the rest of the time it is only read. This seems like the perfect use case for a pthread_rwlock_t, and everything appears to be working well.
I ran across a completely unrelated segfault and started analyzing the core dump. In gdb, I ran the command info threads and was quite surprised at the results. Several threads were reading from the map as expected, but, strangely, several others were blocked in pthread_rwlock_rdlock(), waiting on the rwlock.
Here is the stack trace for a thread that is waiting on the lock:
#0 0xffffe430 in __kernel_vsyscall ()
#1 0xf76fe159 in __lll_lock_wait () from /lib/libpthread.so.0
#2 0xf76fab5d in pthread_rwlock_rdlock () from /lib/libpthread.so.0
#3 0x0804a81a in DiameterServiceSingleton::getDiameterService(void*) ()
With so many threads, it's difficult to say how many were reading and how many were blocked, but I don't understand why any thread would be blocked waiting to read when other threads are already reading.
So here is my question: why are some threads blocked waiting for a read lock on the rwlock when other threads already hold read locks on it? It appears as though there is a limit on the number of threads that can read simultaneously.
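One way to check the "limit on simultaneous readers" theory in isolation is a small probe like the one below. It is only a sketch: `ProbeArgs`, `probeReadLock` and `countConcurrentReaders` are hypothetical names that are not part of the application, and it assumes a Linux/glibc system with POSIX barriers and no writer ever touching the lock. Each thread attempts a non-blocking read lock and then holds it at a barrier, so every successful reader owns the lock at the same moment.

```cpp
#include <pthread.h>
#include <cassert>
#include <vector>

// Hypothetical probe, not part of the application code.
struct ProbeArgs {
    pthread_rwlock_t *lock;
    pthread_barrier_t *barrier;
    int result;  // return value of pthread_rwlock_tryrdlock()
};

static void *probeReadLock(void *p)
{
    ProbeArgs *args = static_cast<ProbeArgs *>(p);
    // Non-blocking attempt, so a refused read lock shows up as an error code
    // instead of a hung thread.
    args->result = pthread_rwlock_tryrdlock(args->lock);
    // Wait until every probe thread has made its attempt; all successful
    // readers are holding the lock simultaneously at this point.
    pthread_barrier_wait(args->barrier);
    if (args->result == 0) {
        pthread_rwlock_unlock(args->lock);
    }
    return NULL;
}

// Returns how many of numThreads (> 0) threads obtained a read lock while
// all the others were holding theirs.
int countConcurrentReaders(int numThreads)
{
    pthread_rwlock_t lock;
    pthread_barrier_t barrier;
    pthread_rwlock_init(&lock, NULL);  // default attributes, as in the question
    pthread_barrier_init(&barrier, NULL, numThreads);

    std::vector<ProbeArgs> args(numThreads);
    std::vector<pthread_t> threads(numThreads);
    for (int i = 0; i < numThreads; ++i) {
        args[i].lock = &lock;
        args[i].barrier = &barrier;
        args[i].result = -1;
        // Assumption: thread creation succeeds; a failure here would leave
        // the other threads stuck at the barrier.
        int rc = pthread_create(&threads[i], NULL, probeReadLock, &args[i]);
        assert(rc == 0);
        (void)rc;
    }

    int succeeded = 0;
    for (int i = 0; i < numThreads; ++i) {
        pthread_join(threads[i], NULL);
        if (args[i].result == 0) {
            ++succeeded;
        }
    }
    pthread_barrier_destroy(&barrier);
    pthread_rwlock_destroy(&lock);
    return succeeded;
}
```

Calling `countConcurrentReaders(48)` and comparing the result with 48 would show whether all 48 threads can hold the read lock together. Build with `-pthread`.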
I've looked at the pthread_rwlockattr_t functions and didn't see anything related.
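For reference, the reader/writer preference stored in a default-initialized attribute object can be read back as sketched below. This assumes glibc, since `pthread_rwlockattr_getkind_np()` is a non-portable extension, and `defaultRwlockKind` is a hypothetical helper name.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical helper: returns the lock kind that a default-initialized
// pthread_rwlockattr_t carries, i.e. what a rwlock initialized with a
// NULL attribute pointer is expected to use.
int defaultRwlockKind()
{
    pthread_rwlockattr_t attr;
    int rc = pthread_rwlockattr_init(&attr);
    assert(rc == 0);

    int kind = -1;
    // glibc extension (hence the _np suffix); not available on every libc.
    rc = pthread_rwlockattr_getkind_np(&attr, &kind);
    assert(rc == 0);
    (void)rc;

    pthread_rwlockattr_destroy(&attr);
    return kind;
}
```

The returned value can be compared against `PTHREAD_RWLOCK_PREFER_READER_NP`, `PTHREAD_RWLOCK_PREFER_WRITER_NP` and `PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP`; the glibc manual page documents the reader-preferring kind as the default.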
The OS is Linux, SUSE 11.
Here is the related code:
{
    pthread_rwlock_init(&serviceMapRwLock_, NULL);
}
// This method is called for each request processed by the threads
Service *ServiceSingleton::getService(void *serviceId)
{
    pthread_rwlock_rdlock(&serviceMapRwLock_);
    ServiceMapType::const_iterator iter = serviceMap_.find(serviceId);
    bool notFound(iter == serviceMap_.end());
    pthread_rwlock_unlock(&serviceMapRwLock_);
    if(notFound)
    {
        return NULL;
    }
    return iter->second;
}
// This method is only called when the app is starting
void ServiceSingleton::addService(void *serviceId, Service *service)
{
    pthread_rwlock_wrlock(&serviceMapRwLock_);
    serviceMap_[serviceId] = service;
    pthread_rwlock_unlock(&serviceMapRwLock_);
}
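A side note on the lookup itself: getService() dereferences the iterator after the lock has been released. That is harmless as long as nothing writes to the map after startup, but a variant that reads the pointer while the lock is still held could look like the sketch below. `ReadLockGuard`, `ServiceRegistry` and the stand-in `Service` struct are hypothetical names used only to keep the example self-contained; this is not the application's actual class.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>
#include <map>

struct Service { int id; };  // stand-in for the real Service class

// Hypothetical RAII helper: takes the read lock on construction and
// releases it when the guard goes out of scope.
class ReadLockGuard {
public:
    explicit ReadLockGuard(pthread_rwlock_t &lock) : lock_(lock)
    {
        pthread_rwlock_rdlock(&lock_);
    }
    ~ReadLockGuard() { pthread_rwlock_unlock(&lock_); }
private:
    ReadLockGuard(const ReadLockGuard &);             // non-copyable
    ReadLockGuard &operator=(const ReadLockGuard &);  // non-assignable
    pthread_rwlock_t &lock_;
};

// Hypothetical simplified version of the singleton from the question.
class ServiceRegistry {
public:
    ServiceRegistry()
    {
        int rc = pthread_rwlock_init(&lock_, NULL);
        assert(rc == 0);
        (void)rc;
    }
    ~ServiceRegistry() { pthread_rwlock_destroy(&lock_); }

    void addService(void *serviceId, Service *service)
    {
        pthread_rwlock_wrlock(&lock_);
        serviceMap_[serviceId] = service;
        pthread_rwlock_unlock(&lock_);
    }

    Service *getService(void *serviceId)
    {
        ReadLockGuard guard(lock_);
        ServiceMapType::const_iterator iter = serviceMap_.find(serviceId);
        if (iter == serviceMap_.end()) {
            return NULL;
        }
        // The return value is copied out before the guard's destructor
        // runs, so the iterator is never touched after the unlock.
        return iter->second;
    }

private:
    typedef std::map<void *, Service *> ServiceMapType;
    pthread_rwlock_t lock_;
    ServiceMapType serviceMap_;
};
```

This changes nothing about how the rwlock behaves under contention; it only removes the post-unlock iterator access.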
Update:
As MarkB mentioned in the comments, if I had used pthread_rwlockattr_setkind_np() to give priority to writers, and a writer were blocked waiting, then the observed behavior would make sense. But I'm using the default value, which I believe gives priority to readers. I just verified that there are no threads blocked waiting to write. I also updated the code as suggested by @Shahbaz in the comments and get the same results.