From Dave Cheney's article about the struct comparison code generated by the Go compiler (https://dave.cheney.net/2020/05/09/ensmallening-go-binaries-by-prohibiting-comparisons):

Padding exists to ensure the correct field alignments, and while it does take up space in memory, the contents of those padding bytes are unknown. You might assume that, being Go, the padding bytes are always zero, but it turns out that’s not the case–the contents of padding bytes are simply not defined. Because they’re not defined to always be a certain value, doing a bitwise comparison may return false because the nine bytes of padding spread throughout the 24 bytes of S [a previously defined struct with padding] may not be the same.

The Go compiler solves this problem by generating what is known as an equality function. In this case S‘s equality function knows how to compare two values of type S by comparing only the fields in the function while skipping over the padding.

EDIT: the same source states that values of type struct {int64, int64} are compared using a memory compare, while struct {int64, int8} requires a custom equality function because of padding, which enlarges the resulting binary.
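To make the difference between the two layouts concrete, here is a minimal sketch that measures how many bytes of each struct are not covered by any field. The type names `packed` and `padded` and the helper functions are hypothetical, chosen only for illustration. The exact amount of padding depends on the target platform (for example, the alignment of int64 differs between common 64-bit and 32-bit targets), so the code computes it instead of assuming a number.

```go
package main

import (
	"fmt"
	"unsafe"
)

// packed mirrors struct {int64, int64}: the two fields fill the struct exactly.
type packed struct {
	a int64
	b int64
}

// padded mirrors struct {int64, int8}: the struct size is rounded up to a
// multiple of its alignment, so some bytes after b belong to no field.
type padded struct {
	a int64
	b int8
}

// packedPadding returns the size of packed minus the sum of its field sizes.
func packedPadding() uintptr {
	var p packed
	return unsafe.Sizeof(p) - unsafe.Sizeof(p.a) - unsafe.Sizeof(p.b)
}

// paddedPadding returns the size of padded minus the sum of its field sizes.
func paddedPadding() uintptr {
	var p padded
	return unsafe.Sizeof(p) - unsafe.Sizeof(p.a) - unsafe.Sizeof(p.b)
}

func main() {
	fmt.Println("packed: size", unsafe.Sizeof(packed{}), "padding", packedPadding())
	fmt.Println("padded: size", unsafe.Sizeof(padded{}), "padding", paddedPadding())
}
```

The bytes counted as padding in `padded` are the ones whose contents are not defined, which is why a raw memory comparison of two such values is not guaranteed to agree with `==`.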

Why doesn't the Go compiler solve this by defining the content of padding bytes, so that it can compare using something like memcmp instead?

EDIT: Is there any overhead in zeroing or comparing one word instead of one byte (e.g.: zeroing and comparing 16 bytes instead of 9 in the previous struct {int64, int8} example)?

neclepsio
  • Mainly because memcmp doesn't do what Go needs. memcmp on string fields would not implement what Go requires from a string comparison. It has to use the fields anyway. – Volker May 11 '20 at 14:57
  • `memcmp` could see a `struct{ int16, int16 }` as equal to a `struct{ int32 }`, even though they can *never* be equal in Go. Padding is irrelevant. – Adrian May 11 '20 at 15:36
  • I would add to the answer and the comments that your point of view might be a bit skewed by fixating on `memcmp`. A compiler may be able to generate highly efficient type-specific comparison code. Even C compilers will try to [inline calls to `memcmp`](https://stackoverflow.com/a/21106815/720999) when they can be sure the semantics of this symbol were not messed up by a programmer (say, by redefining that symbol). Go compilers are free to generate efficient comparison code on a type-by-type basis, right away (since Go is much stricter than C when it comes to typing). – kostix May 11 '20 at 16:57
  • Worth noting: even (or especially?) in C or C++, the padding areas can cause problems: hashing the raw bytes of a struct, for fast lookup, fails to find matching structs when it's a field-by-field match that we care about and use later. I fixed a few bugs like this in the last few years... – torek May 11 '20 at 19:58
  • According to the source, a struct{int64,int64} is compared using "memcmp". A struct{int8,int64} cannot be, because padding is not zeroed out; at least, zeroing is not required by the spec. Of course memcmp shouldn't be used on values of different types, such as struct{int64} and struct{int32,int32}; comparing them is a compilation error. memcmp would fail on strings too. The article is about the size of Go binaries growing when there are many comparable structs. A one-size-fits-all memcmp-like routine for padded structs could help with that. Would zeroing or comparing the padding have a real impact on performance? – neclepsio May 11 '20 at 20:33
  • @neclepsio, what you want to discuss in your comment is interesting indeed, but it is not on topic at SO. May I ask you to bring this question to [the mailing list](https://groups.google.com/forum/#!forum/golang-nuts) instead? It is one of the discussion venues which are well suited to questions requiring open-ended answers and threaded discussion in general. Thanks. – kostix May 16 '20 at 16:07

1 Answer

From the spec:

Struct values are comparable if all their fields are comparable. Two struct values are equal if their corresponding non-blank fields are equal.

In other words, struct equality is not a simple byte-by-byte comparison. It is a field-by-field comparison using the rules for comparability/equality of each field.
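A small sketch of how field-by-field equality can disagree with the raw bits even when no padding is involved, using a float64 field. The type `reading` and the helper functions are hypothetical names used only for this illustration. Under Go's floating-point comparison rules, NaN is not equal to itself, and positive and negative zero compare equal although their bit patterns differ.

```go
package main

import (
	"fmt"
	"math"
)

// reading is a single-field struct, so it contains no padding at all.
type reading struct {
	value float64
}

// nanEqualsCopy compares a struct holding NaN with a bit-identical copy.
func nanEqualsCopy() bool {
	r := reading{value: math.NaN()}
	s := r // same bytes as r
	return r == s
}

// signedZeros compares +0 and -0 both with == and by their raw bit patterns.
func signedZeros() (fieldEqual, bitsEqual bool) {
	pos := reading{value: 0}
	neg := reading{value: math.Copysign(0, -1)}
	fieldEqual = pos == neg
	bitsEqual = math.Float64bits(pos.value) == math.Float64bits(neg.value)
	return fieldEqual, bitsEqual
}

func main() {
	fmt.Println("NaN struct equals its copy:", nanEqualsCopy())
	fieldEqual, bitsEqual := signedZeros()
	fmt.Println("+0 == -0 by fields:", fieldEqual, "by bits:", bitsEqual)
}
```

In both cases a byte-wise comparison would give the opposite answer from `==`, so the comparison has to follow the rules of each field's type.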

Edit:

Even if padding were zeroed, many structs could still not be compared directly using something like memcmp. Types like strings and interfaces are not memory comparable due to their underlying type representations, even though they are logically comparable and can be equal according to the spec. To see this in action, look at https://play.golang.org/p/lmu-THnWY3W.
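As an independent illustration (a sketch of my own, not necessarily what the linked playground contains), the code below compares two structs holding equal strings both with `==` and by their raw bytes. The type `labeled` and the helper `rawBytes` are hypothetical names. It assumes Go 1.18 or later for `strings.Clone`, which is documented to copy the string into a new allocation, and uses `unsafe.Slice` only to view the struct's memory for demonstration purposes.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
	"unsafe"
)

// labeled holds only a string, so there is no padding to confuse the result.
type labeled struct {
	name string
}

// rawBytes views the in-memory representation of v (the string header,
// not the string contents) as a byte slice.
func rawBytes(v *labeled) []byte {
	return unsafe.Slice((*byte)(unsafe.Pointer(v)), int(unsafe.Sizeof(*v)))
}

// compareBothWays builds two structs with equal strings stored in different
// allocations and compares them with == and with a byte-wise comparison.
func compareBothWays() (fieldEqual, bytesEqual bool) {
	x := labeled{name: "padding"}
	y := labeled{name: strings.Clone(x.name)} // same contents, new allocation
	fieldEqual = x == y
	bytesEqual = bytes.Equal(rawBytes(&x), rawBytes(&y))
	return fieldEqual, bytesEqual
}

func main() {
	fieldEqual, bytesEqual := compareBothWays()
	fmt.Println("equal by ==:", fieldEqual)
	fmt.Println("equal by raw bytes:", bytesEqual)
}
```

The two values are equal according to the spec, but their headers point at different backing arrays, so a memcmp-style comparison of the struct memory reports them as different.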

If you want to see how struct equality is implemented, check out the source code.

chash
  • Ok, let me restate the question. Some structs can be compared by memory compare, and the compiler does that. Padding content is not defined, so structs with padding cannot. So, why not define that padding must be zeroed? – neclepsio May 11 '20 at 20:44
  • @neclepsio I can't speak for the Go authors, but maybe because it still wouldn't allow memory-based comparison of all structs, like those containing strings, interfaces, other structs containing those things, etc. It seems like an edge case and zeroing the padding likely has some measurable overhead cost. – chash May 11 '20 at 22:46
  • Thank you, that's what I think too, but I don't understand how, for example, on a 64-bit architecture it could take a different amount of time to zero or compare int64s versus int8s. – neclepsio May 12 '20 at 09:59