
I'm trying to reduce the amount of locking my code needs to do, and I've come across a somewhat academic question about how pthread_mutex_lock treats its memory barriers. To make this easy to understand, let's say the mutex is protecting a data field that is totally static once initialized, but I want to defer its setup until the first access. The code I want to write looks like:

/* Assume data is set to NULL before any threads are started,
 * and that the mutex has been properly initialized.
 */
if (NULL == data) {
    pthread_mutex_lock(&lock);
    /* Re-check data in case another thread set it up first */
    if (NULL == data)
        data = deferred_setup_fcn();
    pthread_mutex_unlock(&lock);
}

The possible issue I see is that data is set up inside the lock but read outside it. Is it possible for the compiler to cache the value of data across the mutex lock call? Or do I have to insert the appropriate volatile keywords to prevent that?

I know it would be possible to do this with a pthread_once call, but I wanted to avoid introducing another data field (the lock was already there protecting related fields).
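For comparison, the pthread_once route might look like the sketch below. This is hypothetical code using the question's names; deferred_setup_fcn here just returns a placeholder payload.

```c
#include <pthread.h>
#include <stddef.h>

static void *data = NULL;
static pthread_once_t data_once = PTHREAD_ONCE_INIT;

static void *deferred_setup_fcn(void) {
    static int payload = 42;   /* placeholder for the real setup work */
    return &payload;
}

static void init_data(void) {
    data = deferred_setup_fcn();
}

void *get_data(void) {
    /* pthread_once guarantees init_data runs exactly once, and that
     * every caller sees its effects before pthread_once returns. */
    pthread_once(&data_once, init_data);
    return data;
}
```

The extra pthread_once_t is the "another data field" the question would rather avoid.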

A pointer to a definitive guide on POSIX threads function call memory orderings would work great too.

J Teller
    Using one more variable for `pthread_once(3)` seems a small price to pay for getting an already-debugged routine to perform exactly this sort of task. :) – sarnold May 23 '11 at 22:08
    Although it doesn't look like your example has any such shortcoming, beware the possible pitfalls of double-checked lock pattern: http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf – Adam Holmberg May 23 '11 at 22:09
  • @sarnold: pthread_once definitely works -- and I'm definitely reinventing the wheel here. Like I said though, this is a bit of an academic question. – J Teller May 24 '11 at 00:03
  • @Adam: Looks like a really good paper that might tell me exactly why what I'm doing won't work. I'll read through it quickly and see ... – J Teller May 24 '11 at 00:06
    @sarnold: `pthread_once` has the unfortunate property that, if the initialization function can fail due to a temporary failure condition, there is no way to "retry initialization" on the next call. So sometimes it's worth rolling your own. – R.. GitHub STOP HELPING ICE May 24 '11 at 02:12
  • @R.., never thought of that. Thanks! :) – sarnold May 24 '11 at 02:13

3 Answers


The problem with this pattern is that memory barriers must be paired between two threads, but a reader in your example may execute no instruction that implies a barrier at all.

Thus there is no guarantee that memory writes performed by deferred_setup_fcn() are visible even if the write to data is visible (from the point of view of a reader that races with a writer). That is, the reader could see data != NULL, but when it actually tries to access the values pointed to by data, find a half-initialised or uninitialised structure.

caf
  • I think this is the right answer -- for this pattern to work, I'd need to put ADDITIONAL memory barriers around the locks and the read to make sure this works correctly ... let me muse on this and read a bit more before I mark a right answer. – J Teller May 25 '11 at 00:36
    @J Teller: For most memory consistency models, you'd have to put a write memory barrier between the call to `deferred_setup_fcn()` and actually overwriting `data`. For extremely weak consistency models, like the Alpha, I think you'd also have to put a read memory barrier after testing `data`. – caf May 25 '11 at 01:15
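The barrier placement caf describes can be sketched with C11 atomics (a hypothetical reworking, not the question's original code): the writer publishes the pointer with a release store, and the reader's fast path uses an acquire load, so a non-NULL pointer implies the setup writes are visible.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static _Atomic(void *) data = NULL;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *deferred_setup_fcn(void) {
    static int payload = 42;   /* placeholder for the real setup work */
    return &payload;
}

void *get_data(void) {
    /* Acquire load: if we see a non-NULL pointer, we also see
     * every write deferred_setup_fcn() made before the release store. */
    void *p = atomic_load_explicit(&data, memory_order_acquire);
    if (p == NULL) {
        pthread_mutex_lock(&lock);
        /* Relaxed is enough here: the mutex already orders this access. */
        p = atomic_load_explicit(&data, memory_order_relaxed);
        if (p == NULL) {
            p = deferred_setup_fcn();
            /* Release store publishes the fully initialised object. */
            atomic_store_explicit(&data, p, memory_order_release);
        }
        pthread_mutex_unlock(&lock);
    }
    return p;
}
```

(C11 postdates this question; at the time the equivalent would have been platform-specific barrier intrinsics.)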

The compiler is allowed to cache values in certain cases, but one of the places where it must assume a barrier is a call to a function it cannot see into. The pthread_mutex_lock call should therefore suffice to make it refetch data for the second test. Unfortunately I haven't found a proper reference, but the question has come up before:

Is function call a memory barrier?

Does guarding a variable with a pthread mutex guarantee it's also not cached?

The latter seems to refine the answer a bit: Function calls in general do not give this guarantee, but pthread_mutex_lock does.

Yann Vernier
  • These two SO questions are good conversations, but they have some issues: the first thread seems to be incorrect (I'm commenting on it right now to try to correct the record, so to speak). The second one definitely works if all accesses are protected by the pthread lock: if every read and write of the shared variable is protected with the lock, then pthread_mutex_lock is a memory barrier – J Teller May 24 '11 at 00:02
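The always-locked variant that the comment describes might look like this (a sketch using the question's names; deferred_setup_fcn is a placeholder):

```c
#include <pthread.h>
#include <stddef.h>

static void *data = NULL;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *deferred_setup_fcn(void) {
    static int payload = 42;   /* placeholder for the real setup work */
    return &payload;
}

/* Every access goes through the mutex, so the lock/unlock pair
 * provides all the ordering and visibility guarantees needed,
 * at the cost of taking the lock on every read. */
void *get_data(void) {
    pthread_mutex_lock(&lock);
    if (NULL == data)
        data = deferred_setup_fcn();
    void *p = data;
    pthread_mutex_unlock(&lock);
    return p;
}
```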

According to Hans Boehm's Reordering Constraints for Pthread-style Locks (page 14), as of 2006, NPTL on Alpha and PowerPC could allow memory accesses that precede the pthread_mutex_lock() call in program order to be reordered past the lock's memory barrier.

In another SO question, R.. argues that POSIX requires full memory barriers, but real-world implementations do not always seem to provide them.

ninjalj