4

I have a block of code that needs to run fast. Right now I'm using pthread_mutex_lock/pthread_mutex_unlock to synchronize the threads, but I saw that it has a noticeable impact on performance. Has anyone ever benchmarked this: is sem_post/sem_wait significantly faster than pthread_mutex_lock/pthread_mutex_unlock?

Thanks!

Rachid K.
Rad'Val
  • I found this based on @Giovanni's answer: [link](http://stackoverflow.com/questions/195853/spinlock-versus-semaphore). It seems to shed some light on it. – Rad'Val Apr 14 '11 at 13:32
  • 5
    IMO, you're looking in the wrong place. If locking primitives are taking enough time to notice, you need to concentrate on a design with less locking, not on a faster locking primitive. Unfortunately, a lot of homework (and such) deliberately uses really terrible designs so you have to do synchronization a lot to help teach about synchronization. In reality, synchronization should be relatively unusual. – Jerry Coffin Apr 14 '11 at 13:40

4 Answers

2

No, it is not significantly faster. They are implemented using the same lower-level primitives (read: spin-locks and system calls). The only real answer, though, comes from comparing both in your particular situation.
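A comparison like the one suggested above could start from a minimal sketch such as the following. It only times the uncontended, single-threaded path, which is an assumption that may not match your workload; `now_ns`, `bench_mutex`, and `bench_sem` are hypothetical helper names, and unnamed `sem_init` semaphores are assumed to be available (they are on Linux, but not on every platform).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <time.h>

/* Hypothetical helper: monotonic clock reading in nanoseconds. */
static double now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1e9 + ts.tv_nsec;
}

/* Average cost in ns of one uncontended lock/unlock pair. */
static double bench_mutex(long iters) {
    pthread_mutex_t m;
    if (pthread_mutex_init(&m, NULL) != 0) return -1.0;
    double start = now_ns();
    for (long i = 0; i < iters; i++) {
        pthread_mutex_lock(&m);
        pthread_mutex_unlock(&m);
    }
    double elapsed = now_ns() - start;
    pthread_mutex_destroy(&m);
    return elapsed / iters;
}

/* Average cost in ns of one uncontended wait/post pair on a
   semaphore initialized to 1 (used like a mutex). */
static double bench_sem(long iters) {
    sem_t s;
    if (sem_init(&s, 0, 1) != 0) return -1.0;
    double start = now_ns();
    for (long i = 0; i < iters; i++) {
        sem_wait(&s);
        sem_post(&s);
    }
    double elapsed = now_ns() - start;
    sem_destroy(&s);
    return elapsed / iters;
}
```

Calling both functions with the same iteration count from `main` and printing the results (compiled with `-pthread`) gives a first per-operation number. To say anything about contention, the loops would have to run from several threads at once.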

Nikolai Fetissov
2

I'd say a semaphore is probably slower than a mutex because a semaphore has a superset of the mutex behavior. You can try something at user level, such as a spin lock that runs without kernel support, but it all depends on the rate of lock/unlock operations and the contention.
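As a rough illustration of a user-level spin lock, here is a minimal sketch built on C11 `atomic_flag`. The `sketch_spin_*` names are hypothetical and not part of any library; POSIX also provides `pthread_spin_lock`, which may be the better choice in real code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spin lock type: the flag is set while the lock is held. */
typedef struct { atomic_flag held; } sketch_spinlock;

static void sketch_spin_init(sketch_spinlock *l) {
    atomic_flag_clear(&l->held);
}

/* Returns true if the lock was acquired, false if it was already held. */
static bool sketch_spin_trylock(sketch_spinlock *l) {
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-waits in user space until the flag is observed clear; no system call. */
static void sketch_spin_lock(sketch_spinlock *l) {
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        /* spin */
    }
}

static void sketch_spin_unlock(sketch_spinlock *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

A lock like this burns CPU while waiting, so it tends to pay off only when critical sections are very short and the holder is unlikely to be descheduled.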

Giovanni Funchal
  • "it has a superset of its behavior" could you please explain?:D – Rad'Val Apr 14 '11 at 13:25
  • A mutex only allows 1 thread to be "inside". It only needs to test whether someone is already there. A semaphore allows N threads to be "inside" (N can be 1). It must test how many posts have been done. – Giovanni Funchal Apr 14 '11 at 13:37
  • Hmm, I see. Then I think it makes sense to use a mutex. My problem is probably elsewhere; maybe I have to find a way NOT to take and release the lock so often. I'll try to find other ways of optimizing then. Thanks – Rad'Val Apr 14 '11 at 13:43
  • 2
    Try using profiling to identify precisely the source of the problem. – Giovanni Funchal Apr 14 '11 at 13:44
  • @Giovanni: semaphores (in general) don't implement the concept of "owner", so I'd rather say that they do less than mutexes. If you use a semaphore instead of a mutex, it's on you to implement the "ownership". – Giuseppe Guerrini Apr 14 '11 at 15:53
  • Regular mutexes don't have a concept of owner either, at least not in a testable way (any operation that would test the owner has undefined behavior). Only recursive and error-checking mutexes have testable owners. – R.. GitHub STOP HELPING ICE Apr 14 '11 at 16:22
2

I would expect them to be roughly the same speed, but you could always benchmark it yourself if you really care. With that said, POSIX semaphores do have one, and as far as I'm concerned only one, advantage over the more sophisticated primitives like mutexes and condition variables: sem_post is required to be async-signal-safe. It is the only synchronization-related function which is async-signal-safe, and it makes it possible to perform minimal interaction between threads from a signal handler. That would otherwise be impossible without much heavier tools like pipes or SysV IPC, which don't interact well with the performance-oriented pthread idioms.

Edit: For reference, the simplest implementation of pthread_mutex_trylock:

    if (mutex->type==PTHREAD_MUTEX_DEFAULT) return atomic_swap(mutex->lock, EBUSY);
    else /* lots of stuff to do */

and the simplest implementation of sem_trywait:

    int val = sem->val;
    return (val>0 && atomic_compare_and_swap(sem->val, val, val-1)==val) ? 0 : EAGAIN;

Assuming an optimal implementation, I would guess that a mutex lock might be slightly faster, but again, benchmark it.
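For anyone who wants to experiment with the two fragments above, here is one way they might be written with C11 atomics. This is a sketch under assumptions, not how any real libc implements these functions: `sketch_mutex`, `sketch_sem`, and the `sketch_*` functions are hypothetical, and only the default (non-recursive, non-error-checking) mutex case is covered.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical types: lock is 0 when free and EBUSY when held;
   val is the current semaphore count. */
typedef struct { atomic_int lock; } sketch_mutex;
typedef struct { atomic_int val; } sketch_sem;

/* One unconditional swap: the old value is 0 on success
   and EBUSY if the mutex was already held. */
static int sketch_mutex_trylock(sketch_mutex *m) {
    return atomic_exchange(&m->lock, EBUSY);
}

static void sketch_mutex_unlock(sketch_mutex *m) {
    atomic_store(&m->lock, 0);
}

/* Load, compare, then a conditional swap: slightly more work than
   the mutex path. Like the fragment above, this makes a single
   attempt and reports EAGAIN if another thread changed the count. */
static int sketch_sem_trywait(sketch_sem *s) {
    int val = atomic_load(&s->val);
    int expected = val;
    if (val > 0 && atomic_compare_exchange_strong(&s->val, &expected, val - 1))
        return 0;
    return EAGAIN;
}
```

The difference in work per call is visible here: the mutex path is a single atomic exchange, while the semaphore path needs a load plus a compare-and-swap.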

R.. GitHub STOP HELPING ICE
1

If you're using Objective-C, your environment may be close enough to Cocoa to be able to use Grand Central Dispatch, which would probably be even faster and would definitely be easier.

Nektarios