9

Maybe I'm misunderstanding about std::mutex::try_lock:

This function is allowed to fail spuriously and return false even if the mutex is not currently locked by any other thread.

This means that if no one thread has a lock on that mutex, when I try a try_lock it could return false? For what purpose?

Isn't the function of try_lock return false if its locked OR true if nobody lock it? Not really sure if my non-native english is fooling me...

markzzz
  • 47,390
  • 120
  • 299
  • 507
  • 4
    It's for the same reason that std::condition_variable::wait is allowed to spuriously awake, even if the condition hasn't been set - it allows the OS to optimise some common cases more – UKMonkey Jan 17 '18 at 15:49
  • @YSC I don't really have time to write a full blown answer :( but the condition_variable one is a question I know is around on SO somewhere... – UKMonkey Jan 17 '18 at 15:54

5 Answers5

8

This means that if no one thread has a lock of that mutex, when I try a try_lock, it could return false?

Yes, that's exactly what it says.

Isn't the function of try_lock return false if its locked OR true if nobody lock it?

No, the function of try_lock is to try to lock the mutex.

However, there is more than one way it can fail:

  1. the mutex is already locked elsewhere (this is the one you're thinking of)
  2. some platform-specific feature interrupts or prevents the locking attempt, and control is returned to the caller who can decide whether to retry.

The common case on POSIX-ish platforms, and inherited from POSIX threads, is that a signal is delivered to (and handled by a signal handler in) the current thread, interrupting the lock attempt.

There may be other platform-specific reasons on other platforms, but the behaviour is the same.

Useless
  • 64,155
  • 6
  • 88
  • 132
  • This means that part of my code "randomly" will/won't be executed, even if no other threads is using the lock :O – markzzz Jan 17 '18 at 15:54
  • 8
    No, it means that your code is _statically incorrect and poorly-written_ if it isn't designed to cope with this entirely normal situation. – Useless Jan 17 '18 at 15:55
  • 1
    @markzzz this is why you test the result of try, rather than assume you have it. – UKMonkey Jan 17 '18 at 15:55
  • Yeah, if you just want to block until you get the lock, do it in a loop. Obviously you could just call the regular non-try variant in that case - maybe it would be better if `try_lock` distinguished between the different failure cases ... – Useless Jan 17 '18 at 15:57
  • 1
    I need a solution that DO somethings if nobody has a lock. Else GO AHEAD (i.e. not locking the thread, simply ignoring the if statement) otherwise. `try-lock` seems not the right way. What do you suggest? – markzzz Jan 17 '18 at 15:59
  • 1
    depends who else is contending for the lock, and what it protects, but it sounds like it deserves it's own question – Useless Jan 17 '18 at 16:03
  • 2
    @markzzz: Might be worth stepping back and revisiting the root problem that you're trying to solve, because this seems like a weak strategy to solve it. Focus less on the one approach you think is the right one, and more on what some other approaches may be. Making program behaviour depend on timings and thread interactions like this just sounds like a giant abstraction leak to me. What are the threads supposed to do? – Lightness Races in Orbit Feb 12 '18 at 00:35
2

Based on your comments, I would write (quoting your words):

std::unique_lock<std::mutex> lock(m, std::defer_lock); // m being a mutex
...
if (lock.try_lock()) {
  ... // "DO something if nobody has a lock"
} else {
  ... // "GO AHEAD"
}

Note that lock.try_lock() effectively calls m.try_lock(), therefore it is prone to spurious fail as well. But I wouldn't care much about this issue. IMO, in practice, spurious fails/wakeups are quite rare (as Useless pointed out, on Linux, they can happen when a signal is delivered).

More about spurious issues, see e.g.: https://en.wikipedia.org/wiki/Spurious_wakeup or Why does pthread_cond_wait have spurious wakeups?.

UPDATE

If you really want to eliminate spurious fail of try_lock, you can use some atomic flag such as:

// shared by threads:
std::mutex m;  
std::atomic<bool> flag{false};

// within threads:
std::unique_lock<std::mutex> lock(m, std::defer_lock); // m being a mutex
...
while (true) {
  lock.try_lock();
  if (lock.owns_lock()) {
    flag = true;
    ... // "DO something if nobody has a lock"    
    flag = false;
    break;
  } else if (flag == true) {
    ... // "GO AHEAD"
    break;
  }
}

It may be possibly rewritten to better form, I didn't check. Also, note that flag is not automatically unset via RAII, some scope guard may be useful here.

UPDATE 2

If you do not need also the blocking functionality of mutex, use std::atomic_flag:

std::atomic_flag lock = ATOMIC_FLAG_INIT;

// within threads:
if (lock.test_and_set()) {
    ... // "DO something if nobody has a lock"    
    lock.clear();
} else {
    ... // "GO AHEAD"
}

Just, again, clearing the flag would be better via some RAII mechanism.

Daniel Langr
  • 22,196
  • 3
  • 50
  • 93
  • I don't see any of your code, but you asked: _"I need a solution that DO somethings if nobody has a lock. Else GO AHEAD (i.e. not locking the thread, simply ignoring the if statement) otherwise. try-lock seems not the right way. What do you suggest?"_. This is what I suggest. Moreover, it uses `std::unique_lock` which is better than using mutex directly, since it support guaranteed unlocking via RAII technique. But the main message is that basically I would't care at all about spurious fail in this case. – Daniel Langr Jan 17 '18 at 16:31
  • @markzzz I updated my answer to handle spurious fails. – Daniel Langr Jan 17 '18 at 16:44
1

Unlike was said there, I don't think there is any reason for a try_lock function to failed due to OS-related reasons: such operation is non-blocking, so signals cannot really interrupt it. Most likely it has everything to do with how this function is implemented on CPU-level. After all, uncontested case is usually the most interesting one for a mutex.

Mutex locking usually requires some form of atomic compare exchange operation. C++11 and C11 introduce atomic_compare_exchange_strong and atomic_compare_exchange_weak. The latter is allowed to fail spuriously.

By allowing try_lock to fail spuriously, implementations are allowed to use atomic_compare_exchange_weak to maximize performance and minimize code size.

For example on ARM64 atomic operations are usually implemented using exclusive-load (LDXR) and exclusive-store (STRX) instructions. LDXR fires up "monitor" hardware which starts tracking all accesses to a memory region. STRX only performs the store if no accesses to that region were made between LDXR and STRX instructions. Thus the whole sequence can fail spuriously if another thread accesses that memory region or if there was an IRQ in between the two.

In practice, code generate for try_lock implemented using weak guarantee is not very different from the one implemented using strong guarantee.

bool mutex_trylock_weak(atomic_int *mtx)
{
    int old = 0;
    return atomic_compare_exchange_weak(mtx, &old, 1);
}

bool mutex_trylock_strong(atomic_int *mtx)
{
    int old = 0;
    return atomic_compare_exchange_strong(mtx, &old, 1);
}

Take a look at generated assembly for ARM64:

mutex_trylock_weak:
  sub sp, sp, #16
  mov w1, 0
  str wzr, [sp, 12]
  ldaxr w3, [x0]      ; exclusive load (acquire)
  cmp w3, w1
  bne .L3
  mov w2, 1
  stlxr w4, w2, [x0]  ; exclusive store (release)
  cmp w4, 0           ; the only difference is here
.L3:
  cset w0, eq
  add sp, sp, 16
  ret

mutex_trylock_strong:
  sub sp, sp, #16
  mov w1, 0
  mov w2, 1
  str wzr, [sp, 12]
.L8:
  ldaxr w3, [x0]      ; exclusive load (acquire)
  cmp w3, w1
  bne .L9
  stlxr w4, w2, [x0]  ; exclusive store (release)
  cbnz w4, .L8        ; the only difference is here
.L9:
  cset w0, eq
  add sp, sp, 16
  ret

The only difference is that "weak" version eliminates conditional backward branch cbnz w4, .L8 and replaces it with cmp w4, 0. Backward condition branches are predicted by CPU as "will-be-taken" in the absense of branch prediction information as they are assumed to be part of a loop - such assumption is wrong in this case as most of the time lock will be acquired (low contention is assumed to be the most common case).

Imo this is the only performance difference between those functions. "Strong" version can basically suffer from 100% branch misprediction ratio on single instruction under some workloads.

By the way, ARMv8.1 introduces atomic instructions, so there is no difference between the two, just like on x86_64. Code generated with -march=armv8.1-a flag:

  sub sp, sp, #16
  mov w1, 0
  mov w2, 1
  mov w3, w1
  str wzr, [sp, 12]
  casal w3, w2, [x0]
  cmp w3, w1
  cset w0, eq
  add sp, sp, 16
  ret

Some try_lock functions can fail even when atomic_compare_exchange_strong is used, for example try_lock_shared of shared_mutex might need to increment reader counter and might fail if another reader has entered the lock. "Strong" variant of such function would need to generate a loop and thus can suffer from a similar branch mispredication.

Another minor detail: if mutex is written in C, some compilers (like Clang) might align loop at 16-byte boundary to improve its performance, bloating function body with padding. This is unnecessary if loop almost always runs a single time.


Another reason for spurious failure is a failure to acquire internal mutex lock (if mutex is implemented using a spinlock and some kernel primitive). In theory the same principle could be acquired in the kernel implementation of try_lock, although this does not seems reasonable.

  • _"such operation is non-blocking, so signals cannot really interrupt it"_ Ehm unless the operation is totally atomic then of course they can. Every operation is "blocking", it's just a measure of degree – Lightness Races in Orbit Feb 12 '18 at 00:37
  • @LightnessRacesinOrbit If everything is blocking, then nothing is blocking. Any reasonable `try_lock` implementation in the kernel will only need to wait for few spinlocks. Those waits are not signal-interrupted. Even posix does not list `EINTR` for lock/timedlock functions. –  Feb 12 '18 at 00:41
  • I related only to your claim in the quoted passage, not to `try_lock` on POSIX systems specifically – Lightness Races in Orbit Feb 12 '18 at 00:42
  • @LightnessRacesinOrbit Ofc you can implement `try_lock` through some message passing mechanism built on top of a socket, which is very much interruptible (and this is what Wine _actually_ does), but I would consider such implementation "unreasonable". There is no reason to overcomplicate programming interface. Interface should be as minimal as possible, "unreasonable" implementation should hide any unnecessary details. –  Feb 12 '18 at 00:48
  • Unfortunately, what "should be" and what "must be" are not always in alignment. – Lightness Races in Orbit Feb 12 '18 at 01:18
  • @LightnessRacesinOrbit Now this is just a verbal masturbation. –  Feb 12 '18 at 01:23
  • No, not at all. They are completely different concepts. – Lightness Races in Orbit Feb 12 '18 at 09:28
1

In paper Foundations of the C++ Concurrency Memory Model Section 3, there is already a clear explanation for why the standard allows spurious failures of try_lock. In short, it is specified to make the semantics of try_lock be consistent with the definition of race in C++ memory model.

chain ro
  • 737
  • 2
  • 10
  • 21
0

If the call to try_lock() returns true the call succeeded in locking the lock. If it returns false if it did not. That's all. Yes, the function can return false when nobody else has the lock. False means only that the attempt to lock did not succeed; it does not tell you why it failed.

Pete Becker
  • 74,985
  • 8
  • 76
  • 165