
Suppose I have this:

go func() {
    for range time.Tick(1 * time.Millisecond) {
        a, b = b, a
    }
}()

And elsewhere:

i := a // <-- Is this safe?

For this question, it's unimportant what the value of i is with respect to the original a or b. The only question is whether reading a is safe. That is, is it possible for a to be nil, partially assigned, invalid, undefined, ... anything other than a valid value?

I've tried to make it fail but so far it always succeeds (on my Mac).

I haven't been able to find anything specific beyond this quote in The Go Memory Model doc:

Reads and writes of values larger than a single machine word behave as multiple machine-word-sized operations in an unspecified order.

Is this implying that a single machine word write is effectively atomic? And, if so, are function pointer writes in Go a single machine word operation?

Update: Here's a properly synchronized solution

nicerobot
  • I think so. Take a look at this: (https://play.golang.org/p/b-fyvCiR7b) The size of the pointer is always 4 bytes. In a 32bits processor, the word size is 32 bits (4 bytes). Obviously in a 64bits processor you have 8 bytes words. So based on that and the snippet you posted from the documentation I'd say it's safe. – Amir Keibi Dec 31 '16 at 08:34
  • @AmirKeibi The important thing to note is that the _guarantee_ from the docs is **not** saying a _single machine word operation_ is atomic. It's only saying operations larger than a single machine word are unordered. In reality, regardless of whether a read and write is a single machine word operation, there is still no guarantee (and no way for Go to guarantee) that the operation is atomic. It's hardware dependent, undefined from Go's perspective, and therefore, requires synchronization. – nicerobot Dec 31 '16 at 20:29
  • I think the question title is a bit misleading. Concurrent reads are safe, but everything collapses when at least one write operation is involved. – Deleplace May 02 '17 at 12:06

3 Answers


Unsynchronized, concurrent access to any variable from multiple goroutines, where at least one of the accesses is a write, is undefined behavior according to The Go Memory Model.

Undefined means what it says: undefined. It may be that your program will work correctly, it may be it will work incorrectly. It may result in losing memory and type safety provided by the Go runtime (see example below). It may even crash your program. Or it may even cause the Earth to explode (probability of that is extremely small, maybe even less than 1e-40, but still...).

In your case, undefined means that yes, i may be nil, partially assigned, invalid, undefined, ... anything other than either a or b. That list is only a tiny subset of the possible outcomes.

Stop thinking that some data races are (or may be) benign or harmless. They can be the source of the worst things if left unattended.

Since your code writes to the variable a in one goroutine and reads it in another goroutine (which assigns its value to another variable i), it's a data race and as such it's not safe. It doesn't matter if it works "correctly" in your tests. Someone could take your code as a starting point, extend or build on it, and end up with a catastrophe because of your initially "harmless" data race.
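One way to remove the race is to guard both the swap and the read with a lock. The following is only a minimal sketch, not the solution linked from the question: the `swapper` type and its `swap` and `load` methods are hypothetical names introduced for illustration, and `a` and `b` are assumed to be of type `func() string`.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// swapper guards two function values with a lock so that the swap
// and the read never overlap. The type and its method names are
// hypothetical; they are not part of the original question.
type swapper struct {
	mu   sync.RWMutex
	a, b func() string
}

// swap exchanges a and b under the write lock.
func (s *swapper) swap() {
	s.mu.Lock()
	s.a, s.b = s.b, s.a
	s.mu.Unlock()
}

// load returns the current value of a under the read lock.
func (s *swapper) load() func() string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.a
}

func main() {
	s := &swapper{
		a: func() string { return "a" },
		b: func() string { return "b" },
	}

	done := make(chan struct{})
	go func() {
		ticker := time.NewTicker(time.Millisecond)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				s.swap()
			case <-done:
				return
			}
		}
	}()

	for n := 0; n < 5; n++ {
		i := s.load() // synchronized read: always one of the two functions
		fmt.Println(i())
		time.Sleep(time.Millisecond)
	}
	close(done)
}
```

Which of the two functions each read returns still depends on timing, but every read now happens either before or after a complete swap, so `go run -race` should have nothing to report for this version.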

As related questions, read How safe are Golang maps for concurrent Read/Write operations? and Incorrect synchronization in go lang.

I strongly recommend reading the blog post by Dmitry Vyukov: Benign data races: what could possibly go wrong?

Another very interesting blog post shows an example that breaks Go's memory safety with an intentional data race: Golang data races to break memory safety

icza
  • As usual, very well put. I would also recommend the OP to read [this piece](http://preshing.com/20130618/atomic-vs-non-atomic-operations/) to get a better grasp of the underlying H/W issues. To sum it up for the OP, there exist two problems "below" the code as they write it in the text editor: 1) the compiler can produce machine code which does "strange" things with memory location of your variables; 2) multi-CPU and/or -core hardware has cache coherency issues: when a CPU reads a value from the memory, it's not necessarily reading it from the same place another CPU wrote it to. – kostix Dec 31 '16 at 15:13
  • Oh, and the third problem: 3) Contemporary CPUs on widely used H/W platforms routinely perform reordering of memory accesses. All these three issues only "break things" when the programmer has expectations besides what's guaranteed by the language's memory model, so please don't have them. ;-) – kostix Dec 31 '16 at 15:17
  • _Stop thinking that some data races are (or may be) benign or unharmful._ 1) At one level, what the value of `i` is, a race is benign since I don't care whether it's `a` or `b`. 2) I wouldn't have asked the question if i thought the underlying issue of this being undefined behavior was benign. 3) The main question, which is mostly answered in the @kostix and "what could go wrong?" link is how the hardware handles move and store operations concurrently. I wasn't seeking a commentary on your assumptions about my understanding of concurrency. I was seeking a clarification on Go's Memory Model. – nicerobot Dec 31 '16 at 19:19
  • To clarify _point 1_ in my prior comment: The algorithm is designed to not care about the value in `i`. In the real algorithm, it is designed to actually randomize the assignment-timing to `a` from dozens of values randomly. All that matters is that the read will be a valid value. What it seems to boil down to is, in the compiled code, is the `mov` of the read always going to have a consistent view of the memory it's reading? Can the `mov` into `a`'s memory overlap the read such that the read is reading a partial write? I think the answer is, it's undefined because it's hardware dependent. – nicerobot Dec 31 '16 at 19:56
  • @nicerobot That's my point. You _don't_ have that guarantee that `i`'s value will be valid, that's what _undefined_ means. Your program may even _crash_, I guess you don't take that as being benign. It _may_ be that with the current compiler, hardware and generated code you won't experience this, but you don't have the guarantee. It may even be that the next version of Go compiler will generate a different, "optimized" code which will "misbehave" according to what you wanted your program to do – just because you left data races in your code. – icza Dec 31 '16 at 20:05
  • @icza Yep, i don't actually care if the app panics during the read (if `recover` can trap it and the panic is infrequent). I just wanted the clarification of Go's guarantees pertaining to the _single machine word operations_. At any rate, I think this _being undefined_ is the "safe" answer, and [@kostix](http://preshing.com/20130618/atomic-vs-non-atomic-operations/)'s link provides the best clarification for those new to concurrency. – nicerobot Dec 31 '16 at 20:12
  • @nicerobot, consider using `sync/atomic` operations to access your value: you won't get any ordering guarantees (and you don't need them) but you will be safe in terms of the memory model as these functions ensure proper memory fencing where required. You could search for a recent thread on the `golang-nuts` mailing list dealing exactly with this issue. – kostix Dec 31 '16 at 20:14
  • @kostix Thanks. It's appreciated. I'm well versed in concurrency and Go's concurrency primitives and options in the std lib. I was only seeking a clarification of the memory model documentation and that specific Go guarantee of _single machine word operations_. Regardless, great responses all around and will hopefully be helpful to others. And your link above is one of the clearest i've read on the subject in a long time. Thanks again. – nicerobot Dec 31 '16 at 20:22
  • @icza Here's the thing, targeting amd64, `i := a` compiles to a `MOVQ`. Even in parallel pipelines, I think that won't ever result in an inconsistent value. You can't know which value you'll get but because it's a single instruction I think it will always get a consistent view of memory. CPU and memory systems certainly enforce that. In other words, while I agree that it's not, in general, a safe operation, if i had restricted the question to amd64, i think it can be said it will be safe. – nicerobot Jan 01 '17 at 15:43
  • @nicerobot, FYI, that's [the thread](https://groups.google.com/d/topic/golang-nuts/I-p5vmyln9c/discussion) I referred to, and in particular, [this response](https://groups.google.com/d/msg/golang-nuts/I-p5vmyln9c/zRdn4NBEAwAJ) from one of the core team folks dealing with `Load*()` and `Store*()` of `sync/atomic` and happens-before guarantees they offer according to the memory model; it also mentions an interesting [issue](https://golang.org/issue/5045). Hope this will be of interest to you. – kostix Jan 02 '17 at 10:20
  • @nicerobot, as to «it's okay on amd64», my take is that you should extend this definition a bit: «it's apparently okay on amd64 with the `gc` toolset version 1.X.Y». Sure real-world code *does* rely on *implied* quirks of particular H/W platforms and compiler suites, but this should be well documented in the code if you intend to actually maintain it (as opposed to using it for some one-off calculation). Otherwise something like that dreaded [Debian's OpenSSL debacle](https://research.swtch.com/openssl) could happen. – kostix Jan 02 '17 at 10:25

In terms of race conditions, it's not safe. In short, my understanding of a race condition is that more than one asynchronous routine (coroutines, threads, processes, goroutines, etc.) tries to access the same resource and at least one of the accesses is a write. In your example, two goroutines read and write variables of function type. What matters from a concurrency point of view is that those variables occupy memory somewhere, and we are reading and writing that memory concurrently.

Short answer: just run your example using the -race flag with go run -race or go build -race and you'll see a detected data race.

Yandry Pozo
  • That's a good point. But I don't think his question was about race conditions. "That is, is it possible for i to be nil, partially assigned, invalid, undefined, ... anything other than either a or b?" – Amir Keibi Dec 31 '16 at 08:43
  • _The only question is whether assigning to i is safe_ – Yandry Pozo Dec 31 '16 at 08:47
  • Not really, he clarifies what the question is: "anything other than either a or b?" – Amir Keibi Dec 31 '16 at 08:48
  • @AmirKeibi And since there is a race condition, the answer is yes. – nos Dec 31 '16 at 08:52
  • My question isn't really about a data race. While `-race` does indeed show a race, i don't care about it if the value is always guaranteed to be valid. I believe this ultimately is more about a hardware question and whether it's ever possible for a value written to an address to be in a state that's invalid while a concurrent read is occurring. – nicerobot Dec 31 '16 at 18:52
  • In the end, I think the best advice is to respect the `-race` :) – nicerobot Dec 31 '16 at 20:00

The answer to your question, as of today, is that if a and b are not larger than a machine word, i must be equal to a or b. Otherwise, it may contain an unspecified value, most likely an interleaving of different parts of a and b.

The Go memory model, as of the version of June 6, 2022, guarantees that even when a program contains a data race, an access to a memory location not larger than a machine word must be atomic.

Otherwise, a read r of a memory location x that is not larger than a machine word must observe some write w such that r does not happen before w and there is no write w' such that w happens before w' and w' happens before r. That is, each read must observe a value written by a preceding or concurrent write.

The happens-before relation used here is defined in the preceding section of the memory model.

The result of a racy read from a larger memory location is unspecified, but it is definitely not undefined as in the realm of C++.

Reads of memory locations larger than a single machine word are encouraged but not required to meet the same semantics as word-sized memory locations, observing a single allowed write w. For performance reasons, implementations may instead treat larger operations as a set of individual machine-word-sized operations in an unspecified order. This means that races on multiword data structures can lead to inconsistent values not corresponding to a single write. When the values depend on the consistency of internal (pointer, length) or (pointer, type) pairs, as can be the case for interface values, maps, slices, and strings in most Go implementations, such races can in turn lead to arbitrary memory corruption.
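If you want a defined result without relying on the semantics of racy reads at all, the value can go through `sync/atomic`. The following is a minimal sketch under stated assumptions: the `handler` type and the `setHandler` and `getHandler` helpers are hypothetical names, and the swapped values are assumed to be functions returning a string. `atomic.Value` requires every stored value to have the same concrete type, which the named function type takes care of.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// handler is a hypothetical named type for the function values being swapped.
type handler func() string

// current holds the active handler. All stored values share the
// concrete type handler, as atomic.Value requires.
var current atomic.Value

// setHandler atomically publishes h as the active handler.
func setHandler(h handler) { current.Store(h) }

// getHandler atomically reads the active handler.
// It must only be called after the first setHandler.
func getHandler() handler { return current.Load().(handler) }

func main() {
	a := handler(func() string { return "a" })
	b := handler(func() string { return "b" })
	setHandler(a)

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		// Single writer that keeps alternating between the two handlers.
		for n := 0; n < 1000; n++ {
			if n%2 == 0 {
				setHandler(b)
			} else {
				setHandler(a)
			}
		}
	}()

	for n := 0; n < 5; n++ {
		i := getHandler() // always a complete handler, either a or b
		fmt.Println(i())
	}
	wg.Wait()
}
```

This sketch uses a single writer that alternates between the two values, which sidesteps the read-modify-write that a true swap of two shared variables would need. For a real two-variable swap, a lock is the simpler tool.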

Quân Anh Mai