
I found this article, but it looks wrong, because `Cell` does not guarantee any synchronization between a `set()` under the lock and a `get()` outside the lock.

Does `Atomic_.store(true, Ordering::Release)` affect other non-atomic write operations?

I tried to write it with `AtomicPtr`, which looks close to the Java style, but it failed. I couldn't find examples of the correct use of `AtomicPtr` in such cases.

    Your question is a reasonable one, but it sounds like [an X-Y problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). You may wish to investigate components like [`Once`](https://doc.rust-lang.org/std/sync/struct.Once.html) and [lazy-static](https://crates.io/crates/lazy-static) demonstrated [here](https://stackoverflow.com/q/27791532/155423) and [here](https://stackoverflow.com/q/27221504/155423). – Shepmaster Aug 14 '17 at 20:10

1 Answer


Does `Atomic_.store(true, Ordering::Release)` affect other non-atomic write operations?

Yes.

Actually, the primary reason Ordering exists is to impose some ordering guarantees on non-atomic reads and writes:

  • within the same thread of execution, for both the compiler and the CPU,
  • so that other threads have guarantees about the order in which they will see the changes.

Relaxed

The least constraining `Ordering`; the only operations that cannot be reordered are operations on the same atomic value:

atomic.store(4, Ordering::Relaxed);
other = 8;
println!("{}", atomic.load(Ordering::Relaxed));

is guaranteed to print 4. If another thread reads that atomic is 4, it has no guarantee about whether other is 8 or not.

Release/Acquire

Write and read barriers, respectively:

  • Release is to be used with store operations, and guarantees that writes performed before the store complete before it,
  • Acquire is to be used with load operations, and guarantees that subsequent reads see values at least as fresh as those written before the corresponding Release store.

So:

// thread 1
one = 1;
atomic.store(true, Ordering::Release);
two = 2;

// thread 2
while !atomic.load(Ordering::Acquire) {}

println!("{} {}", one, two);

guarantees that one is 1, and says nothing about two.

Note that pairing a Relaxed store with an Acquire load, or a Release store with a Relaxed load, is essentially meaningless: synchronization requires both the Release store and the Acquire load.

Note that Rust also provides AcqRel, which is meant for read-modify-write operations (such as `fetch_add` or `compare_exchange`): it behaves as Acquire for the load part and Release for the store part, so you don't have to remember which is which. Using it with a plain `load` or `store`, however, panics.

SeqCst

The most constraining `Ordering`. Beyond the Acquire/Release guarantees, it imposes a single total order on all SeqCst operations, observed consistently by every thread.


What is the right way to write double-checked locking in Rust?

So, double-checked locking is about taking advantage of those atomic operations to avoid locking when unnecessary.

The idea is to have 3 pieces:

  • a flag, initially false, and true once the action has been executed,
  • a mutex, to guarantee exclusion during initialization,
  • a value, to be initialized.

And use them as such:

  • if the flag is true, value already initialized,
  • otherwise, lock the mutex,
  • if the flag still false: initialize and set the flag to true,
  • release the lock, the value is now initialized.

The difficulty is ensuring that the non-atomic reads/writes are correctly ordered (and become visible in the correct order). In theory, you would need full fences for that; in practice following the idioms of the C11/C++11 memory models will be sufficient since compilers must make it work.

Let's examine the code first (simplified):

struct Lazy<T> {
    initialized: AtomicBool,
    lock: Mutex<()>,
    value: UnsafeCell<Option<T>>,
}

impl<T> Lazy<T> {
    pub fn get_or_create<'a, F>(&'a self, f: F) -> &'a T
    where
        F: FnOnce() -> T
    {
        if !self.initialized.load(Ordering::Acquire) { // (1)
            let _lock = self.lock.lock().unwrap();

            if !self.initialized.load(Ordering::Relaxed) { // (2)
                let value = unsafe { &mut *self.value.get() };
                *value = Some(f());
                self.initialized.store(true, Ordering::Release); // (3)
            }
        }

        unsafe { &*self.value.get() }.as_ref().unwrap()
    }
}

There are 3 atomic operations, numbered via comments. We can now check which kind of guarantee on memory ordering each must provide for correctness.

(1) if true, a reference to the value is returned, which must reference valid memory. This requires that the writes to this memory be executed before the atomic turns true, and the reads of this memory be executed only after it is true. Thus (1) requires Acquire and (3) requires Release.

(2), on the other hand, has no such constraint, because locking a Mutex is equivalent to a full memory barrier: all prior writes are guaranteed to have occurred before it, and all subsequent reads will only occur after it. As such, no further guarantee is needed for this load, so Relaxed suffices.

Thus, as far as I am concerned, this implementation of double-checking looks correct in practice.


For further reading, I really recommend the article by Preshing which is linked in the piece you linked. It notably highlights the difference between the theory (fences) and practice (atomic loads/stores which are lowered to fences).

Niklas
Matthieu M.
  • Thank you a lot. But I see one mistake (maybe I'm wrong). The last read of `*self.value.get()` sees the write operation in the case where `self.initialized.load(Ordering::Acquire) == true`, for the reasons you mention. But in the case where `initialized == false`, there is no acquiring load between the releasing store and the non-atomic read `*self.value.get()`. In Java, for example, this is the reason that would make the value volatile (SeqCst). – Александр Меньшиков Aug 16 '17 at 17:00
  • @АлександрМеньшиков: I don't think it's a mistake. Note that you do not load with `Acquire` semantics only if `self.initialized` is true: you first load and then know whether it's true or false. Therefore, even if it eventually evaluates to `false`, you execute it with `Acquire` semantics which sequences the reads *after* evaluating it (and sequences the write before the store with `Release` semantics). – Matthieu M. Aug 16 '17 at 17:07
  • Yes, but after the store with `Release` semantics we don't make a load with `Acquire` **in the same thread**; we just read the value in the race for the return. And this read doesn't see the writes, because the load with `Acquire` semantics was called before any writes. Maybe Rust has a guarantee that all reads and writes to shared memory are sequenced within the same thread -- then you are right. Java does not, and this is the reason for my skepticism. – Александр Меньшиков Aug 16 '17 at 17:26
  • In the C++ example (the article by Preshing), they store the value in a local variable to avoid such problems. – Александр Меньшиков Aug 16 '17 at 17:38
  • @АлександрМеньшиков: First, do we agree that if `self.initialized.load(..)` evaluates to `true` then everything works fine? Now, in the case where `self.initialized.load(..)` evaluates to `false`... the atomic semantics are useless indeed. However, I think it doesn't matter because *mutex*. AFAIK, when you use a mutex to protect some changes, you have a guarantee that the next thread to obtain the lock will see the changes; in short, locking the mutex is at least as good as a new `Acquire` call and unlocking it is at least as good as a new `Release` call. – Matthieu M. Aug 16 '17 at 18:06
  • Yes. If we read the `value` immediately after the load with `Acquire`, everything works fine. And I can believe that if the `Relaxed` `load` sees the result of the `store` made under the lock, everything will be fine, because the mutex synchronizes the `load` and the `store` under the lock. But the thread which initializes the `value` (both `initialized.load` calls return `false`) might not see its own `store` when reading the `value` for the return. In Java, the local order within one thread works only for local variables, but the `value` is shared. To avoid this problem, one often uses a local temporary copy of the shared value. – Александр Меньшиков Aug 17 '17 at 09:43
  • @АлександрМеньшиков: This looks like a subtle issue. I've never heard of such an issue in C or C++; I think the C and C++ standards guarantee that in a single-thread a read will see previous (sequentials) writes without any need for synchronization and I am surprised that there could be an issue in Java around that. If you know of any article explaining this in more details, I'd be happy to understand it, because as it is it just seems really weird to me. – Matthieu M. Aug 17 '17 at 11:07
  • Hm... I have reread the [post](https://shipilev.net/blog/2016/close-encounters-of-jmm-kind/) (search for the text "WARNING: Second read") by Aleksey Shipilev, who is an OpenJDK developer, and it looks like that is a slightly different situation. My apologies. – Александр Меньшиков Aug 17 '17 at 14:58
  • @АлександрМеньшиков: Nice article, thanks for the link. I don't think this applies to C++ (or Rust) because it seems to be relying specifically on the semantics of `volatile` in Java. In C++ or Rust, `volatile` works differently (and should not be used for multi-threading), so you have to use atomics or fences. So, with all that, I think that the situation here is fine. (btw, for a reading of ordering, I just realized that LLVM has a pretty nice explanation of each: http://llvm.org/docs/Atomics.html#monotonic) – Matthieu M. Aug 17 '17 at 15:27
  • How do you analyze the lifetime of `_lock` here? – MrZ May 27 '23 at 06:49
  • @MrZ: Destructors are executed at the end of the lexical scope, unless the object is moved, and are executed in reverse order of creation. Thus `_lock` lives until after the `if` block starting at (2) is completely executed (whether the branch is taken or not), at which point `_lock` is dropped and `self.lock` is unlocked. – Matthieu M. May 27 '23 at 10:31
  • @MatthieuM. What about the "last use" rule? `_lock` is a variable that is never used. – MrZ May 28 '23 at 12:07
  • @MrZ: You're confusing lifetime of variables and duration of borrowing. A variable's destructor is _always_ executed at the end of its lexical scope (unless moved). A variable which borrows another does so until the last use of said variable. There _is_ overlap between the two concepts, though: dropping a variable at the end of the scope _is_ the last use of the variable when doing so invokes `Drop::drop` -- in opposition to "dropping" a reference (`&T` or `&mut T`), which is a no-op and doesn't count. So here, `_lock` is dropped at the end of the scope, and this counts as its last use. – Matthieu M. May 28 '23 at 12:58