This example confuses me. I'm reading the Go memory model, https://golang.org/ref/mem:

var l sync.Mutex
var a string

func f() {
    a = "hello, world"
    l.Unlock()
}

func main() {
    l.Lock()
    go f()
    l.Lock()
    print(a)
}

sync.Mutex implements Lock and Unlock with atomic operations:

Unlock: new := atomic.AddInt32(&m.state, -mutexLocked)

Lock: atomic.CompareAndSwapInt32(&m.state, 0, mutexLocked)

My question is: do atomic.AddInt32 and atomic.CompareAndSwapInt32 act as memory barriers, so that the write to a is guaranteed to be visible in other goroutines?

In Java, I know that AtomicInteger relies on volatile for its memory barriers, which keeps field writes visible across threads.

Peter Cordes
yaoqijun

2 Answers


Go has no equivalent of volatile. The memory model for atomics is not well defined in Go, so to be completely safe you should assume nothing, i.e. the change to a may not be visible. In practice, as far as I understand, every supported architecture issues a memory fence, so you're safe.

There is a long-running issue about defining this behavior, and in it Russ Cox comments:

Yes, I spent a while on this last winter but didn't get a chance to write it up properly yet. The short version is that I'm fairly certain the rules will be that Go's atomics guarantee sequential consistency among the atomic variables (behave like C/C++'s seqconst atomics), and that you shouldn't mix atomic and non-atomic accesses for a given memory word.

Related answer https://stackoverflow.com/a/58892365/2133484

Arman Ordookhani
  • That quote doesn't really address what this question is about. The question here is whether atomic RMWs have acquire and/or release semantics, to make sure all non-atomic operations are visible to another thread whose acquire operation syncs with a release operation in the producer. In asm, it is possible on some ISAs to do a weakly-ordered atomic RMW, but that wouldn't be a good design if there isn't a mechanism like C++'s `std::memory_order` optional parameter to specify relaxed vs. acq_rel or seq_cst. – Peter Cordes Mar 23 '21 at 15:10
  • But hopefully "sequential consistency among the atomic variables" does also imply ordering of non-atomic accesses wrt. those seq_cst operations, like in C++. – Peter Cordes Mar 23 '21 at 15:11
  • According to the latest version of the documentation, when an atomic write is seen by an atomic read, the write is synchronized before that read, and hence a happens-before edge is induced. https://go.dev/ref/mem#atomic – pveentjer Aug 30 '22 at 08:11
  • pveentjer good reminder to check latest! @arman-ordookhani the [go docs](https://go.dev/ref/mem#atomic) now say "The preceding definition has the same semantics as C++’s sequentially consistent atomics and Java’s volatile variables." – Josh Hibschman Nov 04 '22 at 13:09

Test program:

package main

import (
    "sync/atomic"
)

var n uint32

func main() {
    n = 100
    atomic.AddUint32(&n, 1)
}

Check the generated assembly with:

go tool compile -S main.go         
"".main STEXT nosplit size=27 args=0x0 locals=0x0 funcid=0x0
    0x0000 00000 (main.go:9)    TEXT    "".main(SB), NOSPLIT|ABIInternal, $0-0
    0x0000 00000 (main.go:9)    FUNCDATA    $0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x0000 00000 (main.go:9)    FUNCDATA    $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x0000 00000 (main.go:10)   MOVL    $100, "".n(SB)
    0x000a 00010 (main.go:11)   MOVL    $1, AX
    0x000f 00015 (main.go:11)   LEAQ    "".n(SB), CX
    0x0016 00022 (main.go:11)   LOCK
    0x0017 00023 (main.go:11)   XADDL   AX, (CX)
    0x001a 00026 (main.go:12)   RET
    0x0000 c7 05 00 00 00 00 64 00 00 00 b8 01 00 00 00 48  ......d........H
    0x0010 8d 0d 00 00 00 00 f0 0f c1 01 c3                 ...........
    rel 2+4 t=15 "".n+-4
    rel 18+4 t=15 "".n+0
go.cuinfo.packagename. SDWARFCUINFO dupok size=0
    0x0000 6d 61 69 6e                                      main
""..inittask SNOPTRDATA size=24
    0x0000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0x0010 00 00 00 00 00 00 00 00                          ........
"".n SNOPTRBSS size=4
type..importpath.sync/atomic. SRODATA dupok size=13
    0x0000 00 0b 73 79 6e 63 2f 61 74 6f 6d 69 63           ..sync/atomic
gclocals·33cdeccccebe80329f1fdbee7f5874cb SRODATA dupok size=8
    0x0000 01 00 00 00 00 00 00 00                 

The LOCK prefix is described as follows:

Causes the processor’s LOCK# signal to be asserted during execution of the accompanying instruction (turns the instruction into an atomic instruction). In a multiprocessor environment, the LOCK# signal ensures that the processor has exclusive use of any shared memory while the signal is asserted. In most IA-32 and all Intel 64 processors, locking may occur without the LOCK# signal being asserted. See the “IA-32 Architecture Compatibility” section below for more details. The LOCK prefix can be prepended only to the following instructions and only to those forms of the instructions where the destination operand is a memory operand: ADD, ADC, AND, BTC, BTR, BTS, CMPXCHG, CMPXCH8B, CMPXCHG16B, DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD, and XCHG. If the LOCK prefix is used with one of these instructions and the source operand is a memory operand, an undefined opcode exception (#UD) may be generated. An undefined opcode exception will also be generated if the LOCK prefix is used with any instruction not in the above list. The XCHG instruction always asserts the LOCK# signal regardless of the presence or absence of the LOCK prefix. The LOCK prefix is typically used with the BTS instruction to perform a read-modify-write operation on a memory location in shared memory environment. The integrity of the LOCK prefix is not affected by the alignment of the memory field. Memory locking is observed for arbitrarily misaligned fields. This instruction’s operation is the same in non-64-bit modes and 64-bit mode.

So yes, the atomic operation provides memory visibility.

ffcactus
  • The question is asking about the visibility of *other* variables, e.g. whether release/acquire synchronization between threads using atomics make it safe to read the result of the `a = "hello, world"` non-atomic assignment. – Peter Cordes Nov 25 '21 at 03:07
  • Oh... it's asking that question. Does LOCK create a "happens before" effect? idk. – ffcactus Nov 25 '21 at 07:18
  • On x86, yes, it's also a full memory barrier, so `lock inc` is equivalent to C++ `std::atomic` `.fetch_add` with the default `memory_order_seq_cst`. (And x86 loads have "acquire" semantics because of the TSO memory model.) Of course, with x86's strong memory model, even plain `mov` store and plain `mov` load create a release/acquire "synchronizes with", giving you an inter-thread "happens before". As long as the compiler doesn't reorder things at compile time. [how are barriers/fences and acquire, release semantics implemented microarchitecturally?](https://stackoverflow.com/q/58070428) – Peter Cordes Nov 25 '21 at 07:28