18

As most C programmers know, you can't directly compare two structures.

Consider:

void isequal(MY_STRUCT a, MY_STRUCT b)
{
    if (a == b)
    {
        puts("equal");
    }
    else
    {
        puts("not equal");
    }
}

The a==b comparison will AFAIK throw a compile error on any sensible C compiler, because the C standard doesn't allow for built-in structure comparison. Workarounds using memcmp are of course a bad idea due to alignment, packing, bitfields etc., so we end up writing element by element comparison functions.

On the other hand it DOES allow for structure assignment e.g. a = b is entirely legal. Clearly the compiler can cope with that fairly trivially, so why not comparison?

The only idea I had was that structure assignment is probably fairly close to memcpy(), as the gaps due to alignment etc. don't matter. On the other hand, a comparison might be more complicated. Or is this something I'm missing?

Obviously, I'm aware that doing a simple element by element comparison isn't necessarily enough, e.g. if the structure contains a pointer to a string, but there are circumstances where it would be useful.
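To make the contrast concrete, here is a minimal sketch of what we end up writing today. The members of MY_STRUCT are hypothetical (the question never defines them), and my_struct_equal is just an illustrative name, not anything standard.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical layout: the question never defines MY_STRUCT. */
typedef struct {
    char   tag;
    int    id;
    double value;
} MY_STRUCT;

/* Hand-written, element-by-element comparison (illustrative name). */
bool my_struct_equal(const MY_STRUCT *a, const MY_STRUCT *b)
{
    return a->tag == b->tag
        && a->id == b->id
        && a->value == b->value;
}

/* Assignment is built in; comparison has to go through the helper. */
void demo_assign_and_compare(void)
{
    MY_STRUCT a = { 'x', 42, 1.5 };
    MY_STRUCT b;

    b = a;                              /* legal: whole-struct assignment */
    /* if (a == b) { ... }                 would be rejected by the compiler */
    assert(my_struct_equal(&a, &b));    /* members match after the copy */
}
```

Calling demo_assign_and_compare() from main exercises both halves: the copy needs no help from the programmer, the comparison does.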

WillW
  • 1
    Vote to close: I think this is a truly interesting question, but at the same time, I think you're unlikely to get any answers that aren't mere speculation. – Oliver Charlesworth Aug 24 '11 at 19:55
  • http://stackoverflow.com/questions/141720/how-do-you-compare-structs-for-equality-in-c – Ciro Santilli OurBigBook.com Jul 02 '15 at 07:54
  • how would you handle void pointers if each struct allocated memory with two different calls to malloc – eat_a_lemon May 29 '17 at 18:48
  • Void pointers can be compared for equality when they're on their own. If a struct contained a void pointer that was different from a void pointer in another struct, I would expect them to compare as unequal. – WillW May 29 '17 at 22:03

5 Answers

14

As others have mentioned, here's an extract from C: A Reference Manual by Harbison and Steele:

Structures and unions cannot be compared for equality, even though assignment for these types is allowed. The gaps in structures and unions caused by alignment restrictions could contain arbitrary values, and compensating for this would impose an unacceptable overhead on the equality comparison or on all operations that modified structure and union types.
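The "gaps" in that extract are easy to observe. The sketch below uses a hypothetical struct padded (not from the book) and reports how many bytes of the object belong to no member. The exact numbers depend on the ABI, and the result may well be 0 on some targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical struct; many ABIs insert padding after `c` to align `i`. */
struct padded {
    char c;
    int  i;
};

/* Bytes of the object that belong to no member (may be 0). */
size_t padding_bytes(void)
{
    return sizeof(struct padded) - (sizeof(char) + sizeof(int));
}

void report_layout(void)
{
    /* The first member always sits at offset 0. */
    assert(offsetof(struct padded, c) == 0);

    printf("sizeof(struct padded) = %zu\n", sizeof(struct padded));
    printf("offsetof(i)           = %zu\n", offsetof(struct padded, i));
    printf("padding bytes         = %zu\n", padding_bytes());
}
```

Any bytes counted by padding_bytes() are copied along harmlessly by an assignment, but a byte-wise comparison would have to either skip them or trust their contents.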

Maxim Chetrusca
  • Wouldn't the potential gaps in the structures for alignment cause exactly the same performance problems when assigning two structs as when comparing them? – jcarpenter2 Oct 29 '17 at 02:56
  • 4
    I personally don't buy that argument, but I guess the reasoning is the following: For copying/assigning a struct the generated code is rather simple even though potentially slow: it just needs to copy the whole memory range of the struct and can include the undefined contents of the gaps. When comparing it would have to discern the gaps from the actual fields (if that's the desired behavior), which would require more complicated code. – stefanct Oct 30 '17 at 07:54
9

Comparison is unsupported for the same reason memcmp fails.

Due to padding fields the comparison would fail in unpredictable ways which would be unacceptable for most programmers. Assignment changes the invisible padding fields, but these are invisible anyway, so nothing unexpected there.

Obviously, you may ask: so why doesn't it just zero-fill all the padding fields? Sure, that would work, but it would also make all programs pay for something they might not need.
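A rough sketch of the padding problem, using a hypothetical struct padded and helper names of my own. Two objects get the same member values but different fill bytes underneath, so a member-wise comparison and memcmp can disagree. Whether they actually disagree depends on the target having padding in this struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical struct; padding after `c` is likely but not guaranteed. */
struct padded {
    char c;
    int  i;
};

/* Member-by-member comparison: ignores padding entirely. */
bool padded_equal(const struct padded *a, const struct padded *b)
{
    return a->c == b->c && a->i == b->i;
}

/* Byte-wise comparison: padding bytes take part in the result. */
bool bytes_equal(const struct padded *a, const struct padded *b)
{
    return memcmp(a, b, sizeof *a) == 0;
}

/* Fill every byte of *p (padding included) with `fill`, then set members. */
void init_with_fill(struct padded *p, unsigned char fill, char c, int i)
{
    memset(p, fill, sizeof *p);
    p->c = c;
    p->i = i;
}

void demo_padding(void)
{
    struct padded a, b;

    init_with_fill(&a, 0x00, 'x', 7);
    init_with_fill(&b, 0xFF, 'x', 7);

    assert(padded_equal(&a, &b));   /* the members are the same */
    /* bytes_equal(&a, &b) is typically false here when padding exists,
       but the result is not something the standard pins down. */
}
```

Note that even the memset trick is not watertight: C11 6.2.6.1p6 says padding bytes take unspecified values whenever a value is stored in the struct or one of its members, so zero-filling once is a convention that happens to work on common compilers rather than a guarantee.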

EDIT

Oli Charlesworth notes in the comments that you may be asking: "why doesn't the compiler generate code for member-by-member comparison". If that is the case, I must confess: I don't know :-). The compiler would have all the information it needs if it only allowed comparison of complete types.

Community
cnicutar
  • 2
    The OP is explicitly suggesting *not* to use `memcmp`, but for the compiler to automatically generate code for member-by-member comparison. – Oliver Charlesworth Aug 24 '11 at 16:53
  • @Oli Charlesworth Perhaps you're right, but I didn't read it as "why doesn't the compiler do it member-by-member". I don't think there's an easy answer for that, the compiler certainly has all the information it needs if it only allows comparison between complete types. – cnicutar Aug 24 '11 at 16:57
  • In addition to padding bits, there may also in some implementations be issues of positive and negative zero. If there are two structures which are identical except that one has a positive zero where the other has a negative zero, there may be cases in which they should be regarded as equal, and other cases where they should not. It would be nice to have syntax for asking the compiler to perform a memberwise comparison (which it could then replace with a memory-compare in cases where that would work) but I'm not sure I'd like == or != for that purpose. – supercat Aug 24 '11 at 17:02
  • 1
    But what exactly is member-by-member comparison? Are two structs containing strings not equal if the strings match, but are not identical? I guess the committee wanted to avoid such problems and let the user do the comparison, since the user knows best what to compare. – Rudy Velthuis Aug 24 '11 at 17:31
  • 1
    @Rudy Velthuis: The most natural way would be to define it as comparing each corresponding member using the same operator (`==` or `!=`). This would mean that if your structs contained pointers to strings, it would be the pointers (and not what they point to) that would be compared for equality. You would also need to define `==` and `!=` for array types, presumably in a similar manner by comparing each corresponding element. The real reason it's not there is probably just because comparing structs for strict equality is only rarely needed. – caf Aug 25 '11 at 01:53
6

I found this in the C rationale (C99 rationale V5.10), 6.5.9:

The C89 Committee considered, on more than one occasion, permitting comparison of structures for equality. Such proposals foundered on the problem of holes in structures. A byte-wise comparison of two structures would require that the holes assuredly be set to zero so that all holes would compare equal, a difficult task for automatic or dynamically allocated variables.

The possibility of union-type elements in a structure raises insuperable problems with this approach. Without the assurance that all holes were set to zero, the implementation would have to be prepared to break a structure comparison into an arbitrary number of member comparisons; a seemingly simple expression could thus expand into a substantial stretch of code, which is contrary to the spirit of C.

In plain English: since structs/unions may contain padding bytes, and the committee did not require these to hold particular values, it chose not to add this feature. Requiring all padding bytes to be zero would have imposed additional run-time overhead.

stefanct
Lundin
  • 1
    I would personally call this rationale nonsense, since the C standard already has other very similar "resource-heavy" requirements: - Structs/unions of static storage duration must have padding bytes set to zero (C11 6.7.9/10). - All non-initialized members of a partially initialized struct must be set to zero (C11 6.7.9/19). – Lundin Nov 01 '17 at 14:13
  • 2
    No, this rationale is entirely consistent. Initialisation of static things to zero is cheap, and it's only done once at startup. It's very cheap to do because the entire memory segment is zeroed out (before some of it is initialised), which you either want to do anyway for security reasons (hosted systems) or is done by your hardware at boot (embedded systems). It's the same reason why static variables are initialised to zero and local variables aren't - performance of things done repeatedly at runtime is much more important. – WillW Nov 02 '17 at 10:09
  • 2
    @WillW No, .bss does not get "initialized by hardware" on embedded systems... And my first argument about zero initialization of partially initialized structs applies to local variables as well as static storage duration ones - this is not necessarily done by .bss copy-down but might as well happen at run-time. – Lundin Nov 02 '17 at 14:13
  • 1
    While not as general a rule as I made it sound, a significant number of embedded targets specify that all non-register memory addresses will be zero at reset. – WillW Nov 02 '17 at 17:16
5

Auto-generating a comparison operator is a bad idea. Imagine how comparison would work for this structure:

struct s1 {
   int len;
   char str[100];
};

This is a Pascal-like string with a maximum length of 100.

Another case:

struct s2 {
   char a[100];
};

How can the compiler know how to compare a field? If this is a NUL-terminated string, the compiler must use strcmp or strncmp. If this is a plain char array, the compiler must use memcmp.
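A small sketch of that ambiguity, using struct s2 from above. The two helper functions and the demo are my own illustration, not anything a compiler generates. The same pair of objects is equal when read as strings and unequal when read as raw bytes, because of leftover characters after the terminator.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

struct s2 {
    char a[100];
};

/* Treats the array as a NUL-terminated string. */
bool s2_equal_as_string(const struct s2 *x, const struct s2 *y)
{
    return strncmp(x->a, y->a, sizeof x->a) == 0;
}

/* Treats the array as 100 raw bytes. */
bool s2_equal_as_bytes(const struct s2 *x, const struct s2 *y)
{
    return memcmp(x->a, y->a, sizeof x->a) == 0;
}

void demo_residue(void)
{
    struct s2 x = { "hello" };           /* rest of the array is zero-filled */
    struct s2 y = { "goodbye, world" };

    strcpy(y.a, "hello");                /* "e, world" survives after the NUL */

    assert(s2_equal_as_string(&x, &y));  /* same string */
    assert(!s2_equal_as_bytes(&x, &y));  /* different bytes */
}
```

Whichever of the two a built-in == picked, it would be wrong for somebody, which is the point being argued here and in the comments below.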

Jens
vromanov
  • 1
    Indeed. Even if the array is supposed to contain bytes, comparing in s1 should only be up to len characters. The user knows that, but the compiler can't, since it has no idea about the semantics of `len`. – Rudy Velthuis Aug 24 '11 at 17:34
  • If the fixed-sized fields are maintained using the routines like strncpy (which in many cases they should be), identical strings will compare identically. While leaving residue from old strings would in most cases be harmless, it could in some case have security implications; it's cleaner to simply zero-fill any fixed-length buffers. – supercat Aug 24 '11 at 17:36
  • 3
    Hm, not very convincing. Nobody would expect the comparison to follow any semantics of the type. Just component-wise comparison would be OK. What is convincing is that there is a simple legacy syntax problem that doesn't allow us to compare arrays. So other composite types that contain arrays can't be compared either. – Jens Gustedt Aug 24 '11 at 18:07
  • This comparison is trivial. There is no need to understand the semantics of what is contained in the string in order to compare. The only reason you need to do that normally is that you are only passed a pointer to the start of a string with no idea of length, so you iterate until NUL is seen. In this case the compiler knows the size of the data, and can do the appropriate comparison for all elements. This might mean that two strings that strcmp the same compare differently, but that is something the user should be aware of. – WillW Aug 25 '11 at 11:00
  • "two strings that strcmp the same compare differently". No feature wins over a dangerous feature any day of the week, hands down. – n. m. could be an AI Jul 31 '14 at 08:15
  • 2
    But interpreting arrays of char as strings wasn't the question, was it? When you try to compare structs that contain `char[]` nobody expects the compiler to make the value 0x00 special. – harper Nov 05 '14 at 09:41
4

To add to the existing good answers:

struct foo {
    union {
        uint32_t i;
        float f;
    } u;
} a, b;
a.u.f = -0.0;
b.u.f = 0.0;
if (a==b) // what to do?!

The problem arises inherently from unions not being able to store/track which member is current.
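Here is a runnable take on the snippet above, assuming IEEE 754 floats (true on mainstream targets, but not required by the C standard). The two helper functions are hypothetical names for the two readings a compiler would have to choose between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct foo {
    union {
        uint32_t i;
        float    f;
    } u;
};

/* Reading 1: the union currently holds a float, so compare as floats. */
bool foo_equal_as_float(const struct foo *a, const struct foo *b)
{
    return a->u.f == b->u.f;
}

/* Reading 2: compare the stored bytes of the float member. */
bool foo_equal_as_bytes(const struct foo *a, const struct foo *b)
{
    return memcmp(&a->u.f, &b->u.f, sizeof a->u.f) == 0;
}

void demo_signed_zero(void)
{
    struct foo a, b;

    a.u.f = -0.0f;
    b.u.f = 0.0f;

    assert(foo_equal_as_float(&a, &b));   /* -0.0f == 0.0f is true */
    /* Assumption: IEEE 754, where the two zeros differ in the sign bit. */
    assert(!foo_equal_as_bytes(&a, &b));
}
```

Since the compiler cannot know whether u.i or u.f is the live member, it has no principled way to pick between these two answers.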

R.. GitHub STOP HELPING ICE
  • That's only a problem because comparison of structs is not specified to begin with (i.e., comparing structs (recursively) containing unions would need to be tackled by the specification, e.g., by declaring it UB) but it's a very interesting aspect, thank you. – stefanct Jul 06 '18 at 12:43
  • 3
    This is the right answer IMO: unions are the real reason why you can't generate code for equality, not runtime overhead. –  Aug 27 '19 at 12:57