11

If I have two C structures initialised to have identical members, can I guarantee that:

memcmp(&struct1, &struct2, sizeof(my_struct))

will always return zero?

simonc
Sparky

5 Answers

10

I don't think you can safely memcmp a structure to test for equality.

From C11 §6.2.6.1 paragraph 6 (Representations of types, General):

When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values.

This implies that you'd need to write a function that compares the individual members of the structure:

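#include <string.h>

/* Returns 1 if every member compares equal, 0 otherwise.
   Padding bytes are never inspected. */
int my_struct_equals(const my_struct *s1, const my_struct *s2)
{
    if (s1->intval == s2->intval &&
        strcmp(s1->strval, s2->strval) == 0 &&
        s1->binlen == s2->binlen &&
        memcmp(s1->binval, s2->binval, s1->binlen) == 0
        /* && ... (remaining members elided in the original) */
        ) {
        return 1;
    }
    return 0;
}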
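
A self-contained version of the function above might look like the sketch below. The question does not give the member types of `my_struct`, so the layout here and the `demo_equal` helper are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: the member types are assumed, not from the question. */
typedef struct {
    int intval;
    const char *strval;           /* NUL-terminated string */
    size_t binlen;                /* number of bytes in binval */
    const unsigned char *binval;  /* binary payload */
} my_struct;

/* Returns 1 when every member matches, 0 otherwise. */
static int my_struct_equals(const my_struct *s1, const my_struct *s2)
{
    return s1->intval == s2->intval &&
           strcmp(s1->strval, s2->strval) == 0 &&
           s1->binlen == s2->binlen &&
           memcmp(s1->binval, s2->binval, s1->binlen) == 0;
}

/* Builds two logically equal structs whose pointer members differ. */
static int demo_equal(void)
{
    char text1[] = "abc", text2[] = "abc";
    unsigned char bin1[] = { 1, 2, 3 }, bin2[] = { 1, 2, 3 };
    my_struct a = { 7, text1, sizeof bin1, bin1 };
    my_struct b = { 7, text2, sizeof bin2, bin2 };
    assert(a.strval != b.strval);  /* a raw memcmp of a and b would see this */
    return my_struct_equals(&a, &b);
}
```

Besides ignoring padding, a member-wise comparison also handles pointer members sensibly: the two structs in `demo_equal` hold different pointers to equal data, which a raw `memcmp()` of the structs would report as different.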
simonc
9

No, two structures whose members are all equal may still fail to compare equal under memcmp(), because of padding.

One plausible example is as follows. For the initialization of st2, a standard-compliant 32-bit compiler could generate a sequence of assembly instructions that leaves part of the final padding uninitialized. That padding will contain whatever happened to be on the stack, whereas st1's padding will typically contain zero:

#include <string.h>

struct S { short s1; long long i; short s2; } st1 = { 1, 2, 3 };

int main(void) {
  struct S st2 = { 1, 2, 3 };
  /* At this point memcmp(&st1, &st2, sizeof(struct S)) could plausibly be nonzero. */
  return 0;
}
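
To see where the padding in `struct S` sits on a given implementation, a small sketch using `offsetof` can help. The helper names `padding_bytes` and `print_layout` are made up for this illustration, and the numbers printed depend entirely on the ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct S { short s1; long long i; short s2; };

/* Bytes of struct S that belong to no member, i.e. padding. */
static size_t padding_bytes(void)
{
    return sizeof(struct S)
         - (sizeof(short) + sizeof(long long) + sizeof(short));
}

/* Prints where each member starts so the gaps become visible. */
static void print_layout(void)
{
    printf("s1 at %zu, i at %zu, s2 at %zu, size %zu, padding %zu\n",
           offsetof(struct S, s1), offsetof(struct S, i),
           offsetof(struct S, s2), sizeof(struct S), padding_bytes());
}
```

Any nonzero result from `padding_bytes()` means there are bytes that `memcmp()` would compare but that no member initializer is required to set.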
Pascal Cuoq
2

If both variables are global or static, and their members were initialized at program startup, then yes, they will compare equal with memcmp(). (Note: most systems simply load the initialized data into zero-initialized pages, but the C standard does not guarantee this behavior.)

Also, if one of the structures were initialized with the other using memcpy(), then they will compare equal with memcmp().

If both were initialized to some common value with memset() first before their members are initialized to the same values, then they will also compare equal with memcmp() (unless their members are also structures, then the same restrictions apply recursively).
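
The memcpy() case is the easiest one to demonstrate. Below is a minimal sketch, reusing the `struct S` layout from another answer on this page; the helper name `copy_compares_equal` is invented for the example.

```c
#include <assert.h>
#include <string.h>

struct S { short s1; long long i; short s2; };

/* Copies src into a fresh object byte for byte, then compares the two. */
static int copy_compares_equal(const struct S *src)
{
    struct S dst;
    memcpy(&dst, src, sizeof dst);   /* copies members and padding alike */
    return memcmp(&dst, src, sizeof dst) == 0;
}
```

Note that this relies on memcpy() copying every byte. A plain structure assignment (`dst = *src`) need not copy the padding, so the same check after an assignment is not guaranteed to succeed.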

jxh
  • I didn't downvote, but I don't think C11 s6.7.9.10 backs up the first paragraph. It reads as if initialisation of statics/globals is a series of assignments of members to 0 or NULL. If this is correct, s6.2.6.1 paragraph 6 suggests that any padding bytes would have unspecified values – simonc May 29 '13 at 14:17
2

Besides the obvious case of struct padding, it is not even guaranteed for single variables. See the footnote for 6.2.6.1 (8):

It is possible for objects x and y with the same effective type T to have the same value when they are accessed as objects of type T, but to have different values in other contexts. In particular, if == is defined for type T, then x == y does not imply that memcmp(&x, &y, sizeof (T)) == 0. Furthermore, x == y does not necessarily imply that x and y have the same value; other operations on values of type T may distinguish between them.
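
Floating-point types give a concrete illustration of the footnote. The sketch below assumes IEEE 754 doubles (which the C standard does not require) and uses made-up helper names. It shows both directions: values that compare equal with different bytes, and identical bytes that compare unequal.

```c
#include <assert.h>
#include <math.h>
#include <string.h>

/* +0.0 and -0.0 compare equal with ==, yet differ in the sign bit. */
static int zeros_equal_by_value(void)
{
    double pz = 0.0, nz = -0.0;
    return pz == nz;
}

static int zeros_equal_by_bytes(void)
{
    double pz = 0.0, nz = -0.0;
    return memcmp(&pz, &nz, sizeof pz) == 0;
}

/* A NaN never compares equal, not even to a byte-for-byte copy. */
static int nan_equal_by_value(void)
{
    double x = NAN, y;
    memcpy(&y, &x, sizeof y);
    return x == y;
}

static int nan_equal_by_bytes(void)
{
    double x = NAN, y;
    memcpy(&y, &x, sizeof y);
    return memcmp(&x, &y, sizeof x) == 0;
}
```

Under the IEEE 754 assumption, the two zeros are equal by value but not by bytes, while the NaN and its copy are equal by bytes but not by value.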

Secure
  • This is a great point. A concrete example of this is floating-point types, where there are positive and negative zero values that compare equal but have different bit patterns. – Michael Anderson May 31 '13 at 11:29
  • It's probably worth noting that floats also provide the opposite case: take two doubles `x=y=0.0/0` (NaN, not `1.0/0`, which is infinity and compares equal to itself); they compare as unequal, `x!=y`, but have `memcmp(&x,&y,sizeof(double))==0`. – Michael Anderson May 31 '13 at 14:15
-1

You can guarantee that they're identical if you ensure that both entire memory blocks are initialised before they're populated, e.g. with memset:

memset(&struct1, 0, sizeof(my_struct));

EDIT leaving this here because the comment stream is useful.

Alnitak
  • 2
    that's not backed by language semantics - padding bytes always take unspecified values, which in particular means that compilers are free to overwrite ambient padding when assigning to members; no idea if that happens in practice, though... – Christoph May 29 '13 at 14:24
  • @Christoph: Are you sure about that? If so, I don't even want to think about how many protocol stacks will just suddenly break on such a system. – jxh May 29 '13 at 14:34
  • 1
    @user315052: *When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values* (C11 6.2.6.1 §6) – Christoph May 29 '13 at 14:40
  • 1
    @Christoph: I read the footnote, though, and the intent of that phrase was to allow structure assignment to be implemented with `memcpy()`. – jxh May 29 '13 at 14:42
  • @user315052: the footnote reads *Thus, for example, structure assignment need not copy any padding bits.*, which is more or less the opposite of your claim, ie structure assignment can be implemented without `memcpy()`ing the whole blob – Christoph May 29 '13 at 14:44
  • @Christoph: I was reading footnote 42 in C99. I don't have C11 in front of me at the moment. – jxh May 29 '13 at 14:46
  • it's easy to come up with other reasons for this restriction, eg on architectures where the size of a logically addressable unit of memory (byte) is smaller than the unit of physically addressable memory (word), ie where byte-wise access has to be emulated via shifts and such; having to keep padding intact would mean that we had to read the padding every time we want to write to a less-than-word-sized member – Christoph May 29 '13 at 14:46
  • @Christoph: C already allows for a `char` to be more than 8 bits, so I don't believe such a system would be implemented that way. – jxh May 29 '13 at 14:48
  • @user315052: maybe I want to run a POSIX system on a DCPU-16 ;); even if my example is contrived, I do believe my point stands going by guarantees made by the standard alone – Christoph May 29 '13 at 15:10
  • @Christoph: Reading that phrase again, it is not clear to me if "including in a member object" means the object is a member object, or if the value is being stored to a member of the object. That is, it is not clear from the sentence if the value itself is of structure or union type. It would make sense if a structure is a member of another structure, then structure assignment to the member may affect padding to the containing structure. I am not sure if I can buy assignment to other members, since they should be governed by the rules already set out earlier in the same section. – jxh May 29 '13 at 15:49
  • @user315052 I was going to point out the same ambiguity. It's unclear whether the phrase "value stored" relates to the value stored in the _entire structure_ (i.e. as a single assignment), or to storing values within individual members of that structure. – Alnitak May 29 '13 at 15:53
  • @Alnitak: it's not as clear as it could be, but the gist is that storing into a structure invalidates padding, regardless of whether you access the structure as a whole via assignment (`a = b`) or only target a specific member (`a.foo = 42`), which is what the part *including in a member object* refers to – Christoph May 29 '13 at 16:17
  • 2
    note that after having re-read the relevant parts of the standard, my claim that the value of padding bytes is *always* unspecified appears to be incorrect - it only gets invalidated by storage into the structure (assignment to the structure or its members) - if you do all your modifying manipulations byte-wise (cast to `char*`, `memcpy()`, ...), padding bytes should retain their values – Christoph May 29 '13 at 16:27
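
The byte-wise approach from the last comment could be sketched as follows. This is only an illustration of that reading of the standard, not a confirmed guarantee; `struct S` is borrowed from another answer and the helper names are invented here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct S { short s1; long long i; short s2; };

/* Zeroes every byte, then writes the members through memcpy() only,
   so no store goes through an lvalue of struct or member type. */
static void s_init_bytewise(struct S *p, short s1, long long i, short s2)
{
    unsigned char *bytes = (unsigned char *)p;
    memset(bytes, 0, sizeof *p);
    memcpy(bytes + offsetof(struct S, s1), &s1, sizeof s1);
    memcpy(bytes + offsetof(struct S, i),  &i,  sizeof i);
    memcpy(bytes + offsetof(struct S, s2), &s2, sizeof s2);
}

/* Initializes two objects the same way and compares them as raw bytes. */
static int bytewise_init_compares_equal(void)
{
    struct S a, b;
    s_init_bytewise(&a, 1, 2, 3);
    s_init_bytewise(&b, 1, 2, 3);
    assert(a.s1 == b.s1 && a.i == b.i && a.s2 == b.s2);
    return memcmp(&a, &b, sizeof a) == 0;
}
```

The trade-off is that any later ordinary assignment such as `a.s1 = 4` would, on the reading given in the comments, leave the padding unspecified again, so the discipline has to be kept for the object's whole lifetime.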