1

Assuming no string less than 4 bytes is ever passed, is there anything wrong with this optimization? And yes it is a significant speedup on the machines I've tested it on when comparing mostly dissimilar strings.

#define STRCMP(a, b) ( (*(int32_t*)a) == (*(int32_t*)b) && strcmp(a, b) == 0)

And assuming strings are no less than 4 bytes, is there a faster way to do this without resorting to assembly, etc?

Edwin Skeevers
  • Your integer comparison is comparing the *pointers*, not what data they might point to. That means `char s[] = "foo"; STRCMP(s, "foo")` will not work as you expect. Your "speedup" is either due to bad testing or from some other unknown cause. – Some programmer dude Jan 24 '23 at 16:07
  • Also I hope you benchmark, measure and profile an *optimized* build. – Some programmer dude Jan 24 '23 at 16:11
  • Please don't update question in response to comments and answers, unless they are typos in the question itself. If you "fix" the code then the question no longer makes sense and you might as well delete it (but I don't think you should in your case, others will have use of it). – Some programmer dude Jan 24 '23 at 16:12
  • I think you will find that this is how most strcmp implementations work, by comparing chunks – pm100 Jan 24 '23 at 16:13
  • @pm100: That's actually non-trivial. `strcmp` has to consider `"a\0X"` and `"a\0Y"` equal, even though the two `char[4]` are clearly unequal. – MSalters Jan 24 '23 at 16:24
  • @msalters I never said it was trivial; what I mean is that C library implementers will have optimized these heavily used functions already – pm100 Jan 24 '23 at 17:04
  • Aside from the alignment issues others have commented on, it doesn't return the same values as strcmp(), which can return strictly positive and negative values. It would therefore be better to give it a different name, e.g. AREEQUALSTRINGS(). – Simon Goater Jan 24 '23 at 18:36
  • _"Assuming..."_ ... Never assume... – Fe2O3 Jan 24 '23 at 19:02

3 Answers

6

Casting the address of a char array to an `int *` and dereferencing it is always a strict aliasing violation, in addition to possibly violating alignment restrictions.

Example

See the question "UDP checksum calculation not working with newer version of gcc" for just one example of the dangers of strict aliasing violations.

Note that C implementations themselves are free to make use of undefined behavior internally. The implementers have knowledge and complete control over the implementation, neither of which someone using someone else's compiler will in general have.
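For comparison, a minimal sketch of a well-defined way to get the same "compare the first four bytes, then fall back to strcmp()" behavior: copy the bytes into integers with memcpy() rather than casting the pointers. The function name `str_equal_prefix4` is hypothetical and not from the question. The sketch assumes, as the question stipulates, that each string occupies at least 4 bytes including its terminator. Compilers typically lower a fixed-size memcpy() like this to a single load, but that is worth confirming by profiling an optimized build.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (name not from the question).
 * Returns true when the two strings are equal.
 * Assumption: each string is at least 3 characters plus the
 * terminator, so reading 4 bytes stays inside the string. */
static inline bool str_equal_prefix4(const char *a, const char *b)
{
    uint32_t wa, wb;

    /* memcpy has no alignment or aliasing requirements on its source. */
    memcpy(&wa, a, sizeof wa);
    memcpy(&wb, b, sizeof wb);

    /* Different first words mean different strings; otherwise
     * let strcmp decide. */
    return wa == wb && strcmp(a, b) == 0;
}
```

Because this is a function rather than a macro, each argument is evaluated exactly once, and the name no longer suggests strcmp()'s return convention.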

Andrew Henle
  • Out of interest, is doing it the other way round okay - casting an int * to a char * and dereferencing it? – Simon Goater Jan 24 '23 at 18:48
  • @SimonGoater Accessing the bytes of any object via a character pointer [is well-defined](http://port70.net/~nsz/c/c11/n1570.html#6.3.2.3p7): "When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object." – Andrew Henle Jan 24 '23 at 20:06
5

`*(int32_t*)a` assumes that `a` is 4-byte aligned. That's in general not the case.

MSalters
  • Note that if any alignment criteria are not met, just `(int32_t*)a` is sufficient to invoke undefined behavior - there's [no dereference necessary](https://port70.net/~nsz/c/c11/n1570.html#6.3.2.3p7): "A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined." – Andrew Henle Jan 24 '23 at 16:17
  • @AndrewHenle: Correct. C++ doesn't have the notion of "which Undefined Behavior happens first". This example has alignment and aliasing errors with both `a` and `b`, but none of the 4 can be said to come first. – MSalters Jan 24 '23 at 16:21
  • I didn't mean to imply there was any ordering. I was just pointing out that merely doing a cast to another type can invoke UB if the alignment restrictions for that type are violated. – Andrew Henle Jan 24 '23 at 16:25
1

is there anything wrong with this optimization?

Alignment

Yes, `(int32_t*)a` risks undefined behavior when `a` does not meet the alignment requirement of `int32_t`.

Inverted meaning

strcmp() returns 0 on match. STRCMP() returns 1 on match. Consider alternatives like STREQ().

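Multiple and inconsistent evaluations of `a`

Consider `STRCMP(s++, t)`. `s` is incremented once when the first comparison fails and twice when it succeeds, because the macro expands `a` in both the integer comparison and the strcmp() call.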


And assuming strings are no less than 4 bytes, is there a faster way to do this without resorting to assembly, etc?

Test 1 character

Try profiling the code below. It might not be faster than the OP's code, which has undefined behavior, but it could be faster than calling strcmp() alone.

//#define STRCMP(a, b) ( (*(int32_t*)a) == (*(int32_t*)b) && strcmp(a, b) == 0)
#define STREQ(a, b) ( *(const unsigned char *)(a) == *(const unsigned char *)(b) && strcmp((a), (b)) == 0 )
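The same first-character test can also be written as an inline function, which avoids the multiple-evaluation problem described above. This is only a sketch: the lowercase name `streq` is an assumption, not part of the original answer, and whether it beats a plain strcmp() call depends on the compiler and the data, so it should be profiled too.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical function form of the first-character test.
 * Returns true when the strings are equal. Each argument is
 * evaluated exactly once, unlike with a macro. */
static inline bool streq(const char *a, const char *b)
{
    /* Reading through unsigned char is always well-defined and has
     * no alignment requirement. */
    return *(const unsigned char *)a == *(const unsigned char *)b
        && strcmp(a, b) == 0;
}
```

With this version, a call such as `streq(s++, t)` increments `s` exactly once regardless of whether the first characters match.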

Step back and look at the larger picture for performance improvements.

chux - Reinstate Monica