I maintain software that runs on Windows and several UNIX platforms: Mac, Linux, AIX and Solaris. It implements its own threading infrastructure on top of pthreads or Win32 threads. I'm now adding read-write locks (rwlocks) to this infrastructure so that our developers can use them. So far, so good.
On Mac OS X, we originally implemented threading with plain pthreads, but found that performance was very poor because OS X pthread mutexes always made system calls. Apple recommended that we use GCD dispatch semaphores instead. This worked fine and gave a considerable performance improvement, because waiting on a semaphore is a simple userspace operation when the semaphore is free.
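To make the setup concrete, here is a minimal sketch of the kind of mutex wrapper I mean. The names (`my_mutex_t`, `my_mutex_init`, `my_mutex_lock`, `my_mutex_unlock`) are made up for illustration and are not our real API. It assumes a dispatch semaphore created with a count of 1 on Apple platforms and an ordinary pthread mutex everywhere else.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__APPLE__)
#include <dispatch/dispatch.h>

/* Hypothetical wrapper: a dispatch semaphore with count 1 used as a lock. */
typedef struct { dispatch_semaphore_t sem; } my_mutex_t;

static int my_mutex_init(my_mutex_t *m) {
    assert(m != NULL);
    /* Initial count of 1 means exactly one thread may hold the lock. */
    m->sem = dispatch_semaphore_create(1);
    return m->sem != NULL ? 0 : -1;
}

static int my_mutex_lock(my_mutex_t *m) {
    /* Returns zero once the semaphore has been acquired. */
    return dispatch_semaphore_wait(m->sem, DISPATCH_TIME_FOREVER) == 0 ? 0 : -1;
}

static int my_mutex_unlock(my_mutex_t *m) {
    /* The return value only says whether a waiter was woken; ignore it. */
    (void)dispatch_semaphore_signal(m->sem);
    return 0;
}

#else
#include <pthread.h>

/* On the other UNIX platforms the same interface wraps a pthread mutex. */
typedef struct { pthread_mutex_t mtx; } my_mutex_t;

static int my_mutex_init(my_mutex_t *m) {
    assert(m != NULL);
    return pthread_mutex_init(&m->mtx, NULL) == 0 ? 0 : -1;
}

static int my_mutex_lock(my_mutex_t *m) {
    return pthread_mutex_lock(&m->mtx) == 0 ? 0 : -1;
}

static int my_mutex_unlock(my_mutex_t *m) {
    return pthread_mutex_unlock(&m->mtx) == 0 ? 0 : -1;
}
#endif
```

Callers only ever see the lock/unlock interface, which is why replacing the primitive underneath was easy for mutexes.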
However, I can't see any way to build the equivalent of an rwlock this way, and it looks impossible to express one in terms of a single simple semaphore. Am I missing something, or is this actually impossible?
Note: switching everything to the GCD approach with queues and blocks is not feasible. The code has to work on platforms that don't have GCD, and rewriting every use of the threading infrastructure, across about 170 source files, would not be practical.